
Matrices
Matrices are commonly used in mathematics and statistics, and much of R's power comes from the various operations you can perform with them. In R, a matrix is a vector with two additional attributes, the number of rows and the number of columns. And, since matrices are vectors, they are constrained to a single data type.
You can use the matrix() function to create matrices. You may pass it a vector of values, as well as the number of rows and columns the matrix should have. If you specify the vector of values and one of the dimensions, the other one will be calculated for you automatically to be the lowest number that makes sense for the vector you passed. However, you may specify both of them simultaneously if you prefer, which may produce different behavior depending on the vector you passed, as can be seen in the next example.
By default, matrices are constructed column-wise, meaning that the entries can be thought of as starting in the upper-left corner and running down the columns. However, if you prefer to construct it row-wise, you can send the byrow = TRUE parameter. Also, note that you may create an empty or non-initialized matrix, by specifying the number of rows and columns without passing any actual data for its construction, and if you don't specify anything at all, an uninitialized 1-by-1 matrix will be returned. Finally, note that the same element-repetition mechanism we saw for vectors is applied when creating matrices, so do be careful when creating them this way:
matrix()
#> [,1]
#> [1,] NA
matrix(nrow = 2, ncol = 3)
#> [,1] [,2] [,3]
#> [1,] NA NA NA
#> [2,] NA NA NA
matrix(c(1, 2, 3), nrow = 2)
#> Warning in matrix(c(1, 2, 3), nrow = 2):
data length [3] is not a sub-
#> multiple or multiple of the number of rows [2]
#> [,1] [,2]
#> [1,] 1 3
#> [2,] 2 1
matrix(c(1, 2, 3), nrow = 2, ncol = 3)
#> [,1] [,2] [,3]
#> [1,] 1 3 2
#> [2,] 2 1 3
matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)
#> [,1] [,2] [,3]
#> [1,] 1 2 3
#> [2,] 4 5 6
Matrix subsets can be specified in various ways. Using matrix-like notation, you can specify the row and column selection using the same mechanisms we showed before for vectors, with which you can use vectors with indexes or vectors with logicals, and in case you decide to use vectors with logicals the vector used to subset must be of the same length as the matrix's dimension you are using it for. Since in this case, we have two dimensions to work with, we must separate the selection for rows and columns by using a comma (,) between them (row selection goes first), and R will return their intersection.
For example, x[1, 2] tells R to get the element in the first row and the second column, x[1:2, 1] tells R to get the first through second elements of the third row, which is equivalent to using x[c(1, 2), 3]. You may also use logical vectors for the selection. For example, x[c(TRUE, FALSE), c(TRUE, FALSE, TRUE)] tells R to get the first row while avoiding the second one, and from that row, to get the first and third columns. An equivalent selection is x[1, c(1, 3)]. Note that when you want to specify a single row or column, you can use an integer by itself, but if you want to specify two or more, then you must use vector notation. Finally, if you leave out one of the dimension specifications, R will interpret as getting all possibilities for that dimension:
x <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3, byrow = TRUE)
x[1, 2]
#> [1] 2
x[1:2, 2]
#> [1] 2 5
x[c(1, 2), 3]
#> [1] 3 6
x[c(TRUE, FALSE), c(TRUE, FALSE, TRUE)]
#> [1] 1 3
x[1, c(1, 3)]
#> [1] 1 3
x[, 1]
#> [1] 1 4
x[1, ]
#> [1] 1 2 3
As mentioned earlier, matrices are basic mathematical tools, and R gives you a lot of flexibility when working with them. The most common matrix operation is transposition, which is performed using the t() function, and matrix-vector multiplication, vector-matrix multiplication, and matrix-matrix multiplication, which are performed with the %*% operator we used previously to calculate the dot-product of two vectors.
Note that the same dimensionality restrictions apply as with mathematical notation, meaning that in case you try to perform one of these operations and the dimensions don't make mathematical sense, R will throw an error, as can be seen in the last part of the example:
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)
x <- c(7, 8)
y <- c(9, 10, 11)
A
#> [,1] [,2] [,3]
#> [1,] 1 2 3
#> [2,] 4 5 6
x
#> [1] 7 8
y
#> [1] 9 10 11
t(A)
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 2 5
#> [3,] 3 6
t(x)
#> [,1] [,2]
#> [1,] 7 8
t(y)
#> [,1] [,2] [,3]
#> [1,] 9 10 11
x %*% A
#> [,1] [,2] [,3]
#> [1,] 39 54 69
A %*% t(x)
#> Error in A %*% t(x): non-conformable arguments
A %*% y
#> [,1]
#> [1,] 62
#> [2,] 152
t(y) %*% A
#> Error in t(y) %*% A: non-conformable arguments
A %*% t(A)
#> [,1] [,2]
#> [1,] 14 32
#> [2,] 32 77
t(A) %*% A
#> [,1] [,2] [,3]
#> [1,] 17 22 27
#> [2,] 22 29 36
#> [3,] 27 36 45
A %*% x
#> Error in A %*% x: non-conformable arguments