Chapter 2 Setting up R

2.1 Tutorial Video

Setting up R

A test drive of RGUI environments: concepts & rules

05-June-2020

2.2 Reference

R Operators

Operators in R can mainly be classified into Arithmetic, Relational, Logical and Assignment operators.
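As a quick illustration of the four groups, here is a minimal sketch you can paste into the R console. The variable names (a, b, arith, relational, logical) are only examples and are not part of the course material.

```r
# Assignment operators: `<-` is the usual choice, `=` also works at top level
a <- 7
b = 2

# Arithmetic operators
arith <- c(
  sum     = a + b,    # addition
  diff    = a - b,    # subtraction
  prod    = a * b,    # multiplication
  quot    = a / b,    # division
  int_div = a %/% b,  # integer division
  rem     = a %% b,   # remainder (modulo)
  power   = a ^ b     # exponentiation
)
print(arith)

# Relational operators return TRUE or FALSE
relational <- c(greater = a > b, equal = a == b, not_equal = a != b)
print(relational)

# Logical operators combine or negate TRUE/FALSE values
logical <- c(
  and = (a > b) & (b > 5),
  or  = (a > b) | (b > 5),
  not = !(a > b)
)
print(logical)
```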

R Naming Conventions

There are a few general rules in R for naming variables, objects and functions that you need to follow:

  1. Names must start with a letter or a dot. If you start a name with a dot, the second character can’t be a digit

  2. Names should contain only letters, numbers, underscore characters (_), and dots (.). Although you can force R to accept other characters in names, you shouldn’t, because these characters often have a special meaning in R

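  3. You can’t use the following reserved words as names: break, else, FALSE, for, function, if, Inf, NA, NaN, next, NULL, repeat, TRUE, while. (return is not formally reserved, but avoid it as a name because it would mask the built-in function.)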
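The rules above can be checked in the console. The sketch below relies on base R's make.names(), which turns any string into a syntactically valid name; if it leaves a string unchanged, the string was already valid. The helper is_valid_name() and the candidate names are made up for this illustration.

```r
# Hypothetical helper: TRUE when `x` is already a syntactically valid name,
# i.e. make.names() does not need to change it.
is_valid_name <- function(x) make.names(x) == x

candidates <- c("my_var", ".hidden", "x.1", ".2way", "2fast", "my var", "for")
valid <- is_valid_name(candidates)
print(data.frame(name = candidates, valid = valid))

# Backticks force R to accept an invalid name, but every later use
# needs the backticks too, which is why this is discouraged.
`my var` <- 10
forced <- `my var` + 1
print(forced)
```

The first three candidates pass; ".2way" (dot followed by a digit), "2fast" (starts with a digit), "my var" (contains a space) and "for" (reserved word) do not.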

FAQ

2.2.1 What is RGUI, and what is good and bad about it

The RGui has a built-in interactive editor, namely the script windows that you have been opening to type in longer bits of code/command.

Good: It is best for beginners. It is easy to use and understand. This course mainly uses the ‘Vanilla’ RGUI introduced in the tutorial video.

Bad: It has limited functionality; in particular, it does not have advanced syntax highlighting. Syntax highlighting is where the words in your code are shown in different colors based on their function (for example, comments are in green, loop keywords are in red, etc.). Once you are comfortable with R, it is recommended that you download and use an editor such as RStudio. This course will not cover the usage of any editor other than the ‘Vanilla’ RGUI.

2.2.2 Can you tell me some Add-on packages in R

The R distribution you just installed comes with the following packages (listed here just to raise awareness):

base

Base R functions (and datasets before R 2.0.0).

compiler

R byte code compiler (added in R 2.13.0).

datasets

Base R datasets (added in R 2.0.0).

grDevices

Graphics devices for base and grid graphics (added in R 2.0.0).

graphics

R functions for base graphics.

grid

A rewrite of the graphics layout capabilities, plus some support for interaction.

methods

Formally defined methods and classes for R objects, plus other programming tools, as described in the Green Book.

parallel

Support for parallel computation, including by forking and by sockets, and random-number generation (added in R 2.14.0).

splines

Regression spline functions and classes.

stats

R statistical functions.

stats4

Statistical functions using S4 classes.

tcltk

Interface and language bindings to Tcl/Tk GUI elements.

tools

Tools for package development and administration.

utils

R utility functions.
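If you would like to confirm this list on your own installation, the following sketch may help. It assumes a standard R installation; installed.packages() marks the packages shipped as part of R itself with priority "base". The variable names are illustrative only.

```r
# Packages that ship as part of R itself (priority "base")
base_pkgs <- rownames(installed.packages(priority = "base"))
print(sort(base_pkgs))

# Of these, only some are attached automatically when R starts
startup_pkgs <- getOption("defaultPackages")
print(startup_pkgs)
```

Packages that are installed but not attached at startup (for example splines) still need a library() call before use.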

2.2.3 Which packages are available on my computer/AWS

To find out which packages are available on your system, type

library()

at the R prompt.

This produces something like

Packages in ‘/home/me/lib/R’:

mystuff       My own R functions, nicely packaged but not documented

Packages in ‘/usr/local/lib/R/library’:

KernSmooth    Functions for kernel smoothing for Wand & Jones (1995)
MASS          Main Package of Venables and Ripley's MASS
Matrix        Sparse and Dense Matrix Classes and Methods
base          The R Base package
boot          Bootstrap R (S-Plus) Functions (Canty)
class         Functions for Classification
cluster       Functions for clustering (by Rousseeuw et al.)
codetools     Code Analysis Tools for R
datasets      The R Datasets Package
foreign       Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat,
              dBase, ...
grDevices     The R Graphics Devices and Support for Colours and Fonts
graphics      The R Graphics Package
grid          The Grid Graphics Package
lattice       Lattice Graphics
methods       Formal Methods and Classes
mgcv          GAMs with GCV/AIC/REML smoothness estimation and GAMMs
              by PQL
nlme          Linear and Nonlinear Mixed Effects Models
nnet          Feed-forward Neural Networks and Multinomial Log-Linear
              Models
rpart         Recursive Partitioning
spatial       Functions for Kriging and Point Pattern Analysis
splines       Regression Spline Functions and Classes
stats         The R Stats Package
stats4        Statistical functions using S4 Classes
survival      Survival analysis, including penalised likelihood
tcltk         Tcl/Tk Interface
tools         Tools for Package Development
utils         The R Utils Package
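The same information is also available as data rather than a printed listing. The sketch below uses .libPaths() to show the library folders R searches (the "Packages in ..." headings above) and installed.packages() to return one row per installed package. The output depends entirely on your machine, so none is shown; the variable names are illustrative.

```r
# Library folders that R searches for installed packages
lib_dirs <- .libPaths()
print(lib_dirs)

# One row per installed package; keep a few informative columns
ip <- installed.packages()
pkg_table <- ip[, c("Package", "LibPath", "Version"), drop = FALSE]
print(head(pkg_table))

# Number of packages found in each library folder
print(table(pkg_table[, "LibPath"]))
```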

2.2.4 Load/unload an installed package

You can “load” the installed package pkg by

library(pkg)

You can then find out which functions it provides by typing one of

library(help = pkg)
help(package = pkg)

You can unload the loaded package pkg by

detach("package:pkg", unload = TRUE)
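Here pkg is a placeholder. As a concrete sketch, the steps below use splines, chosen only because it ships with R (see the list in 2.2.2) and is not attached at startup. search() lists what is currently attached, so it is used to confirm each step.

```r
# Load (attach) the package
library(splines)
attached_after_load <- "package:splines" %in% search()
print(attached_after_load)

# List the functions the package makes available
exported <- ls("package:splines")
print(head(exported))

# Unload it again; the name must be given as "package:<name>"
detach("package:splines", unload = TRUE)
attached_after_detach <- "package:splines" %in% search()
print(attached_after_detach)
```

If another loaded package depends on the one being detached, R may keep its namespace in memory and issue a warning instead of unloading it completely.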

2.2.5 32 or 64-bit R or both

For our Dept., I assume you are using 64-bit Windows.

For most users we would recommend using the ‘native’ build, that is the 32-bit version on 32-bit Windows and the 64-bit version on 64-bit Windows.

The advantage of a native 64-bit application is that it gets a 64-bit address space and hence can address far more than 4GB (how much depends on the version of Windows, but in principle 8TB). This allows a single process to take advantage of more than 4GB of RAM (if available) and for R’s memory manager to more easily handle large objects (in particular those of 1GB or more). The disadvantages are that all the pointers are 8 rather than 4 bytes and so small objects are larger and more data has to be moved around, and that less external software is available for 64-bit versions of the OS. The 64-bit compilers are able to take advantage of extra features of all x86-64 chips (more registers, SSE2/3 instructions, …) and so the code may run faster despite using larger pointers. The 64-bit build is nowadays usually slightly faster than the 32-bit build on a recent CPU (Intel Core 2 or later or AMD equivalent).
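To see which build your current R session is, you can inspect the pointer size mentioned above. This is a minimal sketch using base R's .Machine and R.version; the variable names are illustrative.

```r
# Pointer size in bytes: 8 on a 64-bit build, 4 on a 32-bit build
pointer_bytes <- .Machine$sizeof.pointer
build_bits <- 8 * pointer_bytes
cat("This R session is a ", build_bits, "-bit build\n", sep = "")

# Architecture string R was built for (for example "x86_64" or "i386")
print(R.version$arch)
```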

For advanced users the choice may be dictated by whether the contributed packages needed are available in 64-bit builds (although CRAN only offers 32/64-bit builds). The considerations can be more complex: for example 32/64-bit RODBC need 32/64-bit ODBC drivers respectively, and where both exist they may not be able to be installed together. An extreme example is the Microsoft Access/Excel ODBC drivers: if you have installed 64-bit Microsoft Office you can only install the 64-bit drivers and so need to use 64-bit RODBC and hence R. (And similarly for 32-bit Microsoft Office.)

Installing both the 32-bit and 64-bit builds is only relevant if the machine is running a 64-bit version of Windows. In that case, simply select both when using the installer. You can also go back and add 64-bit components to a 32-bit install, or vice versa.

For many Registry items, 32- and 64-bit programs have different views of the Registry, but clashes can occur. The most obvious problem is the file association for .RData files, which will use the last installation for which this option is selected, and if that was for an installation of both, will use 64-bit R. To change the association the safest way is to edit the Registry entry ‘HKEY_CLASSES_ROOT’ and replace ‘x64’ by ‘i386’ or vice versa.