Introduction to R

Command Console

R provides a command console, which is where all code is executed. You can enter commands directly into the console, or run them from a script; either way, the command is executed in the console. For example, we can perform an operation such as

10*10
## [1] 100

and R will print the result. If we want to save the result, we assign it to a variable, which is stored in the environment.

multiplication_variable <- 10*10
multiplication_variable2 = 10*10

So now we have two variables in our environment, multiplication_variable and multiplication_variable2. Both hold the same value; the only difference is in how they were assigned. multiplication_variable was assigned with the <- operator, whereas multiplication_variable2 was assigned with the = operator.
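To check what is currently stored, the environment can be inspected from the console. The following is a minimal sketch using the two variables defined above; the exact output of ls() depends on what else has been created in the session.

```r
multiplication_variable <- 10 * 10
multiplication_variable2 = 10 * 10

# Typing a variable's name prints its value
multiplication_variable

# ls() lists the names currently stored in the environment
ls()

# exists() checks whether a single name is defined
exists("multiplication_variable2")
```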

We can use the == operator to check whether two variables are equal. This is the equality operator, and it will output either TRUE or FALSE (or a vector of TRUE and FALSE values if working with vectors).

multiplication_variable == multiplication_variable2
## [1] TRUE

This confirms that the two variables are equal to each other!
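When applied to vectors, == compares element by element. As a short illustration (the vectors a and b are made up for this example), all() collapses the element-wise result into a single TRUE or FALSE:

```r
a <- c(1, 2, 3)
b <- c(1, 5, 3)

# Element-wise comparison: one TRUE/FALSE per position
a == b
## [1]  TRUE FALSE  TRUE

# all() is TRUE only if every element matches
all(a == b)
## [1] FALSE
```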

Operators and Functions

R has many operators, too many to list here, but the basic arithmetic operators are intuitive: divide (/), multiply (*), add (+) and subtract (-). Some less common operators include matrix multiplication (%*%), integer division (%/%), modulus (%%) and exponentiation (^).
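The less common operators are easiest to understand by trying them. The sketch below uses small made-up values; note that * on matrices multiplies element by element, whereas %*% performs matrix multiplication.

```r
7 / 2     # ordinary division: 3.5
7 %/% 2   # integer division: 3
7 %% 2    # modulus (remainder): 1
2 ^ 3     # exponentiation: 8

# matrix() fills column by column, so m has rows (1, 3) and (2, 4)
m <- matrix(1:4, 2, 2)

elementwise <- m * m     # rows (1, 9) and (4, 16)
matprod     <- m %*% m   # rows (7, 15) and (10, 22)
```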

By default, R loads a number of basic packages, including base, stats and utils. These packages make a large number of functions available. Other packages can be loaded by using the library function. For example, suppose I wanted to simulate from a multivariate Normal distribution. There is no function in base R to do this, but there is one in the MASS package. If this package is not installed, it first needs to be installed by running install.packages("MASS") (which only needs to be done once). To load the package, we run

library(MASS)
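Since installation only needs to happen once, a script can check for the package before trying to install it. This is one possible pattern rather than a required one; it uses requireNamespace, which reports whether a package is available without attaching it.

```r
# Install MASS only if it is not already available
if (!requireNamespace("MASS", quietly = TRUE)) {
  install.packages("MASS")
}

# Attach the package so its functions can be called by name
library(MASS)

# A function can also be called without attaching, using the :: operator
one_draw <- MASS::mvrnorm(1, mu = c(0, 0), Sigma = diag(2))
```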

Now the function should be available. To find out what arguments the function takes, we can look at its help file by running ?mvrnorm. The help file has a ‘Usage’ section detailing the following

mvrnorm(n = 1, mu, Sigma, tol = 1e-6, empirical = FALSE, EISPACK = FALSE)

n - the number of samples required.
mu - a vector giving the means of the variables.
Sigma - a positive-definite symmetric matrix specifying the covariance matrix of the variables.
tol - tolerance (relative to largest variance) for numerical lack of positive-definiteness in Sigma.
empirical - logical. If true, mu and Sigma specify the empirical not population mean and covariance matrix.
EISPACK - logical: values other than FALSE are an error.
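The same information can be checked from the console without opening the help file. As a brief sketch, args prints the function's signature and formals returns its arguments as a list, so the argument names can be extracted:

```r
library(MASS)

# Print the signature, including default values
args(mvrnorm)

# Extract just the argument names
arg_names <- names(formals(mvrnorm))
arg_names
```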

So the arguments to mvrnorm that we will specify are n, mu and Sigma. Of these, mu and Sigma have no default and must always be given, while n defaults to 1. Let’s run the function now:

x = mvrnorm(5, mu = c(1,2), Sigma = matrix(c(1,1,1,1),2,2))

and we can look at this output by simply typing x:

x
##             [,1]      [,2]
## [1,]  0.97765673 1.9776567
## [2,]  0.49305126 1.4930513
## [3,] -0.06323688 0.9367631
## [4,]  0.77825704 1.7782570
## [5,]  2.23512186 3.2351219

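Notice that in the call to mvrnorm, two other functions were used: c and matrix. If you are curious about these, look at their help files. Notice also that every entry of Sigma is 1, so the two variables are perfectly correlated, which is why each value in the second column is exactly 1 greater than the value in the first. Packages and functions are key to using R effectively and efficiently.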
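Because the draws are random, rerunning the call gives different numbers each time. The sketch below shows one way to make a simulation reproducible and to sanity-check it. The seed of 42, the sample size of 1000 and the covariance matrix with off-diagonal entries of 0.5 are all choices made for this illustration, not values from the example above.

```r
library(MASS)

# Fixing the seed makes the random draws repeatable
set.seed(42)

# Illustrative covariance matrix: unit variances, covariance 0.5
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2)

x <- mvrnorm(n = 1000, mu = c(1, 2), Sigma = Sigma)

dim(x)        # one row per sample, one column per variable
colMeans(x)   # should be close to mu
cov(x)        # should be close to Sigma
```

With a large sample, the column means and sample covariance should land near the values passed in as mu and Sigma, which is a quick check that the arguments were specified as intended.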
