### Matrix multiplication

#### Requirements for matrix multiplication

• To multiply matrices $$R_{1}C_{1} * R_{2}C_{2}$$, $$C_{1}$$ must equal $$R_{2}$$, resulting in a matrix with dimensions $$R_{1}C_{2}$$

x <- matrix(1:6, 3)
dim(x)
## [1] 3 2
y <- matrix(1:6, 2)
dim(y)
## [1] 2 3
dim(x %*% y)
## [1] 3 3

#### Mechanics of matrix multiplication

• A matrix product computes the dot product of every row in the first matrix with every column in the second matrix
• The dot product is calculated by multiplying each number in the first matrices row by its corresponding number in the second matrices column, then summing those values
• Example: $\left(\begin{array}{ccc}1 & 2 & 3 \\ 4 & 5 & 6\end{array}\right)\left(\begin{array}{cc}7 & 8 \\ 9 & 10 \\ 11 & 12\end{array}\right) = \left(\begin{array}{cc}58 & 64 \\ 139 & 154 \end{array}\right)$
• These two matrices are compatible ($$C_1 = 3 = R_2 = 3$$)
• The resulting matrix with by 2 x 2 ($$R_1 = 2$$ x $$C_2 = 2$$)
• The dot product of the 1,1 position of the resulting matrix is the dot product of the first matrices 1st row and the second matrices 1st column $\left(\begin{array}{ccc}1 & 2 & 3 \\ & & \end{array}\right)\left(\begin{array}{cc}7 & \\ 9 & \\ 11 & \end{array}\right) = \left(\begin{array}{cc}58 & \\ & \end{array}\right)$ $(1*7) + (2*9) + (3*11) = 58$
• The 1,2 position is the dot product of the first matrix 1st row and second matrix 2nd column
• The 2,1 position is the dot product of the first matrix 2nd row and second matrix 1st column
• THe 2,2 position is the dot product of the first matrix 2nd row and second matrix 2nd column
a <- matrix(1:6, 2, 3, byrow = T)
b <- matrix(7:12, 3, 2, byrow = T)
c1.1 <- sum(a[1,] * b[,1])
c1.2 <- sum(a[1,] * b[,2])
c2.1 <- sum(a[2,] * b[,1])
c2.2 <- sum(a[2,] * b[,2])
c <- matrix(c(c1.1,c1.2,c2.1,c2.2), 2, 2, byrow = T)
all(c == a %*% b)
## [1] TRUE

### Transposition and the cross product

• Transposing a matrix converts its rows into columns and visa versa

x <- matrix(1:6, 2)
x
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
#t(M) will transpose a matrix M
t(x)
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
## [3,]    5    6
• A frequent operation in statistics is to multiply the transpose of one matrix by a second matrix (e.g. $$X^{T}Y$$), this is called the cross product

x <- matrix(4:9, 2)
y <- matrix(1:6, 2)
t(x) %*% y
##      [,1] [,2] [,3]
## [1,]   14   32   50
## [2,]   20   46   72
## [3,]   26   60   94
#crossprod(M,N) will find the cross product between M and N
t(x) %*% y == crossprod(x,y)
##      [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE
## [3,] TRUE TRUE TRUE
#crossprod(M), when no second matrix is given, will find the cross product of a matrix and itself
crossprod(x)
##      [,1] [,2] [,3]
## [1,]   41   59   77
## [2,]   59   85  111
## [3,]   77  111  145
• Note that, in the case of cross products, $$R_{1}$$ must equal $$R_{2}$$, not $$C_{1}$$ and $$R_{2}$$

### Identity matrix

• An identity matrix is a square matrix that, when multiplied by matrix $$X$$ will equal $$x$$
• An identity matrix is a matrix of 0s and 1s, where 1s are found at the diagonals
• Typically the identity matrix is used as $$X*I = X$$, so the number of rows and columns in the identity matrix are the same as the number of columns in $$X$$ ($$C_{X} = R_{I} = C_{I}$$)
#diag(n) will return an identity matrix with n = rows = columns
x <- matrix(1:6, 2)
I <- diag(ncol(x))
I
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1
x %*% I == x
##      [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE

### Inverse matrix

• The inverse matrix such that, when a matrix is multiplied by its inverse, the result is its identity matrix ($$XX^{-1}=I$$)
• Not every matrix has an inverse +How the inverse matrix is calculated is beyond this course, but involves the matrix determinant for square matrices and Gaussian elimination (and other methods) for general $$n$$ x $$n$$ matrices

#solve(M) finds the inverse matrix for matrix M
x <- matrix(c(1,1,1,3,-2,1,2,1,-1), 3, byrow = T)
solve(x)
##            [,1]        [,2]       [,3]
## [1,] 0.07692308  0.15384615  0.2307692
## [2,] 0.38461538 -0.23076923  0.1538462
## [3,] 0.53846154  0.07692308 -0.3846154

### Solving a system of equations

• Solving a system of linear equations is a common matrix algebra task
• Example of a system of equations:

$\begin{array}{rl}a+b+c& =6\\ 3a-2b+c& =2\\ 2a+b-c& =1\end{array}$

• This can be represented as a matrix equation:

$\left(\begin{array}{ccc}1& 1& 1\\ 3& -2& 1\\ 2& 1& -1\end{array}\right)\left(\begin{array}{c}a\\ b\\ c\end{array}\right)=\left(\begin{array}{c}6\\ 2\\ 1\end{array}\right)\phantom{\rule{thickmathspace}{0ex}}⟹\phantom{\rule{thickmathspace}{0ex}}\left(\begin{array}{c}a\\ b\\ c\end{array}\right)={\left(\begin{array}{ccc}1& 1& 1\\ 3& -2& 1\\ 2& 1& -1\end{array}\right)}^{-1}\left(\begin{array}{c}6\\ 2\\ 1\end{array}\right)$

• The first matrix is called the co-variate matrix, the second is the variable matrix, and the third is the solution matrix
• Notice each equation in the system is produced by the dot product of each row in the co-variate matrix and the (single) column of the variable matrix
• We want to use the co-variate and solution matrices to solve for the values in the variable matrix
• Algebraically you would do this by dividing (or similarly multipying by a term to the negative first power)
• With matrices we have to multiply by the inverse (which is similar to mulitplying by a value to its negative first power)
• Thus $$XY = Z$$ can be rearranged to $$Y = X^{-1}Z$$ and matrix multiplication can then be used to solve for $$Y$$
x <- matrix(c(1,1,1,3,-2,1,2,1,-1), 3, byrow = T)
z <- matrix(c(6,2,1), 3)
y <- solve(x) %*% z
y
##      [,1]
## [1,]    1
## [2,]    2
## [3,]    3
#Alternatively, solve(X,Z) can be used so find Y
solve(x,z)
##      [,1]
## [1,]    1
## [2,]    2
## [3,]    3