# (Sparse) Matrices

## Matrices

A matrix is a two-dimensional data structure. The `matrix`

function is used to create matrices, and can have multiple arguments. You can specify the names of the columns and rows by supplying a list to the `dimnames`

argument, and can choose to populate the matrix by column (default) or by row with `byrow=TRUE`

. The function `as.matrix`

will convert a relevant argument to a matrix, and `is.matrix`

results a `TRUE`

or `FALSE`

if the argument is or isn’t a matrix. We can see these here:

```
A = matrix(1:12, nrow=3, ncol=4)
A
```

```
## [,1] [,2] [,3] [,4]
## [1,] 1 4 7 10
## [2,] 2 5 8 11
## [3,] 3 6 9 12
```

```
A = matrix(1:12, nrow=3, ncol=4, byrow=TRUE)
A
```

```
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 5 6 7 8
## [3,] 9 10 11 12
```

```
A = matrix(1:12, nrow=3, ncol=4,
dimnames = list(c("Row1", "Row2", "Row3"),
c("Column1", "Column2", "Column3", "Column4")))
A
```

```
## Column1 Column2 Column3 Column4
## Row1 1 4 7 10
## Row2 2 5 8 11
## Row3 3 6 9 12
```

Accessing elements in a matrix can be done by indexing over either the column or the row. `A[,i]`

will access the \(i\)-th column, and `A[i,]`

will access the \(i\)-th row. These arguments will return a vector, and will lose the structure of the matrix. For example, if we take the 1st column of `A`

we get

`A[,1]`

```
## Row1 Row2 Row3
## 1 2 3
```

which is not a column any more! This is important to consider when working with matrices. To keep the structure of the matrix intact, we can specify `drop=FALSE`

when indexing, e.g.

`A[,1,drop=FALSE]`

```
## Column1
## Row1 1
## Row2 2
## Row3 3
```

The `array`

function in R works like a ‘stack of matrices’, and any number of dimensions can be specified. Instead of the `matrix`

function, `array`

takes one argument corresponding to the dimension, which is a vector; each element being the length of the corresponding dimension, i.e.

`array(1:27,c(3,3,3))`

```
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 10 13 16
## [2,] 11 14 17
## [3,] 12 15 18
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 19 22 25
## [2,] 20 23 26
## [3,] 21 24 27
```

Matrix multiplication between more than 2 matrices can also be sped up by precisely choosing the location of your brackets. Since matrix multiplication works right-to-left, the brackets need to be on the right side. For a large matrix, if we test the speeds of two forms of multiplication, we get

```
N = 1000
M1 = matrix(rnorm(N^2),N,N)
M2 = matrix(rnorm(N^2),N,N)
M3 = matrix(rnorm(N^2),N,N)
system.time(M1 %*% M2 %*% M3)
```

```
## user system elapsed
## 0.462 0.172 0.194
```

`system.time(M1 %*% (M2 %*% M3))`

```
## user system elapsed
## 0.431 0.159 0.195
```

`all.equal(M1 %*% M2 %*% M3, M1 %*% (M2 %*% M3))`

`## [1] TRUE`

So specification of brackets is quite a bit faster, and can speed up computation times for larger problems. Note that the function `all.equal`

checks whether the two arguments are the same within some tolerance, as they are not exactly the same (see later section on Numerical types in R).

### Solving linear systems

Often a linear algebra problem we are interested in is solving \(A\boldsymbol{x} =\boldsymbol{b}\) for \(\boldsymbol{x},\boldsymbol{b} \in \mathbb{R}^n\) and \(A \in \mathbb{R}^{n\times n}\). One solution to this is simply \(\boldsymbol{x} = A^{-1}\boldsymbol{b}\), but the problem here is that *getting* the matrix inverse, as it will take of order \(n^3\) operations. For example, a 1000x1000 matrix (which is not uncommon) will take around 1000\(^3\) = 1,000,000,000 operations, which is inefficient. If you did want to solve the system this way, the function for inverting a matrix in R is `solve`

, e.g.

```
A = matrix(rnorm(9),3,3)
solve(A)
```

```
## [,1] [,2] [,3]
## [1,] -8.888322 2.915525 -1.461660
## [2,] 30.044457 -10.021586 3.193355
## [3,] -11.184011 3.327956 -1.104080
```

This function can also take a second argument, being the right hand side of the system, which in our case is \(b\). This will roughly be the same as `solve(A) %*% b`

.

```
b = c(1,2,3)
solve(A) %*% b
```

```
## [,1]
## [1,] -7.442254
## [2,] 19.581350
## [3,] -7.840339
```

`solve(A, b)`

`## [1] -7.442254 19.581350 -7.840339`

Although you can see the dimension of the output is different, `solve(A) %*% b`

maintains the column structure. However, the method `solve(A,b)`

is faster than `solve(A) %*% b`

.

### Numerical types in R

If we were to find the ‘type’ of a normal integer in R, we get

`typeof(2)`

`## [1] "double"`

What does it mean by a ‘double’? This means that it is a `binary64`

floating point number, i.e. the information stored in the computer for this value is stored in 64 bits; 1 bit for the *sign* of the number, 11 bits for the *exponent* and 52 bits for the *significant precision*. So the largest number we can store is `2^1023`

, since

`2^1024`

`## [1] Inf`

simply returns `Inf`

. We know this number isn’t **actually** infinity, but R recognises that it is too large, and anything over the largest number is stored as the highest possible value. This also means that really *small* numbers aren’t stored correctly either. There is always some form of **floating point error** in R, of order 2\(^{-52}\). Showing this in practice:

`print(1 + 2^(-52), digits=22)`

`## [1] 1.000000000000000222045`

`print(1 + 2^(-53), digits=22)`

`## [1] 1`

Another format R can store numbers in is in the format ‘Long’, specified by a `L`

after the number.

#### Effect on Matrices

Any form of arithmetic in R is going to be affected by floating point error. Most of the time it does not cause any problems though, as it will only affect things at *really small* or *really large* magnitudes. Matrices are specifically succeptible to floating point errors however, as matrix multiplication contains many operations.

Let’s look at some simple matrix multiplication on large matrices and inspect whether there are floating point errors.

```
N = 100 # Square Number
A = matrix(rnorm(N),sqrt(N),sqrt(N))
B = matrix(rnorm(N),sqrt(N),sqrt(N))
C = matrix(rnorm(N),sqrt(N),sqrt(N))
test = c(solve(A %*% (B %*% C)) - solve(A %*% B %*% C))
```

Since these two operations are the same, all entries in the matrix `test`

should be zero. However, this is not the case, as seen below.

`summary(test)`

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.774e-11 -1.969e-12 2.685e-13 3.719e-14 2.575e-12 1.345e-11
```

This is due to floating point error. Not all entries in `test`

are zero, but they are very small. Most of the time, this might not make much of a difference, but when performing calculations involving small numbers this is important to consider.

## Sparse Matrices

A sparse matrix is one where most of the entries are zero. The problem with sparse matrices in programming is that a very large matrix (for example a \(10000 \times 10000\) matrix), the computer would store *every* element in the matrix, even though most are zero. There are various package for dealing with sparse matrices in a better way in R, but the most popular is the `Matrix`

package. This package extends the base R functionality with both sparse and dense matrices, allolwing more operations and more efficient calculations. In this package, we can use the function `rankMatrix`

to return the rank of a input matrix. For example:

```
A = matrix(rnorm(25),5,5)
c(rankMatrix(A))
```

`## [1] 5`

A sparse matrix can be stored as a `dgCMatrix`

(where the `C`

stands for storing by column, other options are row or triplet). Let’s look at the difference between storing a sparse matrix in this way against the default way.

```
A = matrix(0, 1000, 1000)
for(i in 1:1000) A[sample(1:1000,50,1),i] = sample(1:10,50,replace=TRUE)
B = Matrix(A, sparse=TRUE)
A[1:4,1:15]
```

```
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
## [1,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [2,] 0 0 0 0 0 0 0 0 0 3 0 0 0 0
## [3,] 0 0 0 7 0 0 0 0 0 0 6 0 0 0
## [4,] 4 0 0 0 0 0 10 0 0 0 0 0 0 0
## [,15]
## [1,] 0
## [2,] 0
## [3,] 0
## [4,] 0
```

`B[1:4,1:15]`

```
## 4 x 15 sparse Matrix of class "dgCMatrix"
##
## [1,] . . . . . . . . . . . . . . .
## [2,] . . . . . . . . . 3 . . . . .
## [3,] . . . 7 . . . . . . 6 . . . .
## [4,] 4 . . . . . 10 . . . . . . . .
```

`c(object.size(A),object.size(B))`

`## [1] 8000216 590616`

So the sparse matrix is stored at a much smaller object size than a normal matrix. Note that the `Matrix`

function is part of the `Matrix`

package, not to be confused with `matrix`

from base R. The conversion to being stored as a sparse `dgCMatrix`

was done after construction of the matrix, but it could be constructed as a sparse matrix from the start. We can inspect the `dgCMatrix`

object.

`str(B)`

```
## Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
## ..@ i : int [1:48759] 3 15 46 61 70 72 87 116 144 170 ...
## ..@ p : int [1:1001] 0 50 100 148 197 245 294 344 394 444 ...
## ..@ Dim : int [1:2] 1000 1000
## ..@ Dimnames:List of 2
## .. ..$ : NULL
## .. ..$ : NULL
## ..@ x : num [1:48759] 4 10 5 6 6 8 6 9 8 4 ...
## ..@ factors : list()
```

This has a few pieces of information relating to the non-zero elements of the matrix `B`

. The often most interesting ones being the `i`

attribute: the locations of the non-zero entries, and the `x`

attribute: the non-zero entries in these corresponding spots.

## Example

Sparse matrices can have relevant application in many scenarios. For example, in a modelling problem where you want to model the effects of different categorical predictors, you can use ‘one-hot encoding’. This replaces a multi-class input \(\boldsymbol{x} \in \{0, 1, \dots, K \}^n\) with a vector of 1’s and 0’s, where the location of the 1’s correspond to which class is being represented. For example, if we set up a vector of factors in R, we have

```
x.factor = factor(sample(c("Class1", "Class2", "Class3", "Class4"), 20, replace=TRUE))
x.factor
```

```
## [1] Class1 Class4 Class3 Class1 Class2 Class1 Class3 Class3 Class2 Class2
## [11] Class3 Class3 Class1 Class1 Class1 Class2 Class2 Class2 Class2 Class3
## Levels: Class1 Class2 Class3 Class4
```

When fitting a model, we typically will add these entries to a model matrix, and work with that. The `model.matrix`

function in R will automatically set up one-hot encoding in this scenario.

```
M = model.matrix(~x.factor)
M
```

```
## (Intercept) x.factorClass2 x.factorClass3 x.factorClass4
## 1 1 0 0 0
## 2 1 0 0 1
## 3 1 0 1 0
## 4 1 0 0 0
## 5 1 1 0 0
## 6 1 0 0 0
## 7 1 0 1 0
## 8 1 0 1 0
## 9 1 1 0 0
## 10 1 1 0 0
## 11 1 0 1 0
## 12 1 0 1 0
## 13 1 0 0 0
## 14 1 0 0 0
## 15 1 0 0 0
## 16 1 1 0 0
## 17 1 1 0 0
## 18 1 1 0 0
## 19 1 1 0 0
## 20 1 0 1 0
## attr(,"assign")
## [1] 0 1 1 1
## attr(,"contrasts")
## attr(,"contrasts")$x.factor
## [1] "contr.treatment"
```

This matrix is primarily made up of 1’s and 0’s, and so would be better suited to being stored as a sparse matrix.

`Matrix(M, sparse=TRUE)`

```
## 20 x 4 sparse Matrix of class "dgCMatrix"
## (Intercept) x.factorClass2 x.factorClass3 x.factorClass4
## 1 1 . . .
## 2 1 . . 1
## 3 1 . 1 .
## 4 1 . . .
## 5 1 1 . .
## 6 1 . . .
## 7 1 . 1 .
## 8 1 . 1 .
## 9 1 1 . .
## 10 1 1 . .
## 11 1 . 1 .
## 12 1 . 1 .
## 13 1 . . .
## 14 1 . . .
## 15 1 . . .
## 16 1 1 . .
## 17 1 1 . .
## 18 1 1 . .
## 19 1 1 . .
## 20 1 . 1 .
```

In the case of a large data set, when using categorical variables, this will speed up computation time quite significantly.