Introduction to matrix operations
Matrix multiplication
Requirements for matrix multiplication
To multiply matrices \(R_{1}C_{1} * R_{2}C_{2}\), \(C_{1}\) must equal \(R_{2}\), resulting in a matrix with dimensions \(R_{1}C_{2}\)
x <- matrix(1:6, 3) dim(x)
## [1] 3 2
y <- matrix(1:6, 2) dim(y)
## [1] 2 3
dim(x %*% y)
## [1] 3 3
Mechanics of matrix multiplication
- A matrix product computes the dot product of every row in the first matrix with every column in the second matrix
- The dot product is calculated by multiplying each number in the first matrices row by its corresponding number in the second matrices column, then summing those values
- Example: \[\left(\begin{array}{ccc}1 & 2 & 3 \\ 4 & 5 & 6\end{array}\right)\left(\begin{array}{cc}7 & 8 \\ 9 & 10 \\ 11 & 12\end{array}\right) = \left(\begin{array}{cc}58 & 64 \\ 139 & 154 \end{array}\right)\]
- These two matrices are compatible (\(C_1 = 3 = R_2 = 3\))
- The resulting matrix with by 2 x 2 (\(R_1 = 2\) x \(C_2 = 2\))
- The dot product of the 1,1 position of the resulting matrix is the dot product of the first matrices 1st row and the second matrices 1st column \[\left(\begin{array}{ccc}1 & 2 & 3 \\ & & \end{array}\right)\left(\begin{array}{cc}7 & \\ 9 & \\ 11 & \end{array}\right) = \left(\begin{array}{cc}58 & \\ & \end{array}\right)\] \[(1*7) + (2*9) + (3*11) = 58\]
- The 1,2 position is the dot product of the first matrix 1st row and second matrix 2nd column
- The 2,1 position is the dot product of the first matrix 2nd row and second matrix 1st column
- THe 2,2 position is the dot product of the first matrix 2nd row and second matrix 2nd column
a <- matrix(1:6, 2, 3, byrow = T) b <- matrix(7:12, 3, 2, byrow = T) c1.1 <- sum(a[1,] * b[,1]) c1.2 <- sum(a[1,] * b[,2]) c2.1 <- sum(a[2,] * b[,1]) c2.2 <- sum(a[2,] * b[,2]) c <- matrix(c(c1.1,c1.2,c2.1,c2.2), 2, 2, byrow = T) all(c == a %*% b)
## [1] TRUE
Transposition and the cross product
Transposing a matrix converts its rows into columns and visa versa
x <- matrix(1:6, 2) x
## [,1] [,2] [,3] ## [1,] 1 3 5 ## [2,] 2 4 6
#t(M) will transpose a matrix M t(x)
## [,1] [,2] ## [1,] 1 2 ## [2,] 3 4 ## [3,] 5 6
A frequent operation in statistics is to multiply the transpose of one matrix by a second matrix (e.g. \(X^{T}Y\)), this is called the cross product
x <- matrix(4:9, 2) y <- matrix(1:6, 2) t(x) %*% y
## [,1] [,2] [,3] ## [1,] 14 32 50 ## [2,] 20 46 72 ## [3,] 26 60 94
#crossprod(M,N) will find the cross product between M and N t(x) %*% y == crossprod(x,y)
## [,1] [,2] [,3] ## [1,] TRUE TRUE TRUE ## [2,] TRUE TRUE TRUE ## [3,] TRUE TRUE TRUE
#crossprod(M), when no second matrix is given, will find the cross product of a matrix and itself crossprod(x)
## [,1] [,2] [,3] ## [1,] 41 59 77 ## [2,] 59 85 111 ## [3,] 77 111 145
Note that, in the case of cross products, \(R_{1}\) must equal \(R_{2}\), not \(C_{1}\) and \(R_{2}\)
Identity matrix
- An identity matrix is a square matrix that, when multiplied by matrix \(X\) will equal \(x\)
- An identity matrix is a matrix of 0s and 1s, where 1s are found at the diagonals
- Typically the identity matrix is used as \(X*I = X\), so the number of rows and columns in the identity matrix are the same as the number of columns in \(X\) (\(C_{X} = R_{I} = C_{I}\))
#diag(n) will return an identity matrix with n = rows = columns
x <- matrix(1:6, 2)
I <- diag(ncol(x))
I
## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 0 1 0
## [3,] 0 0 1
x %*% I == x
## [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE
Inverse matrix
- The inverse matrix such that, when a matrix is multiplied by its inverse, the result is its identity matrix (\(XX^{-1}=I\))
Not every matrix has an inverse +How the inverse matrix is calculated is beyond this course, but involves the matrix determinant for square matrices and Gaussian elimination (and other methods) for general \(n\) x \(n\) matrices
#solve(M) finds the inverse matrix for matrix M x <- matrix(c(1,1,1,3,-2,1,2,1,-1), 3, byrow = T) solve(x)
## [,1] [,2] [,3] ## [1,] 0.07692308 0.15384615 0.2307692 ## [2,] 0.38461538 -0.23076923 0.1538462 ## [3,] 0.53846154 0.07692308 -0.3846154
Solving a system of equations
- Solving a system of linear equations is a common matrix algebra task
Example of a system of equations:
- This can be represented as a matrix equation:
- The first matrix is called the co-variate matrix, the second is the variable matrix, and the third is the solution matrix
- Notice each equation in the system is produced by the dot product of each row in the co-variate matrix and the (single) column of the variable matrix
- We want to use the co-variate and solution matrices to solve for the values in the variable matrix
- Algebraically you would do this by dividing (or similarly multipying by a term to the negative first power)
- With matrices we have to multiply by the inverse (which is similar to mulitplying by a value to its negative first power)
- Thus \(XY = Z\) can be rearranged to \(Y = X^{-1}Z\) and matrix multiplication can then be used to solve for \(Y\)
x <- matrix(c(1,1,1,3,-2,1,2,1,-1), 3, byrow = T) z <- matrix(c(6,2,1), 3) y <- solve(x) %*% z y
## [,1] ## [1,] 1 ## [2,] 2 ## [3,] 3
#Alternatively, solve(X,Z) can be used so find Y solve(x,z)
## [,1] ## [1,] 1 ## [2,] 2 ## [3,] 3