Matrix multiplication

Requirements for matrix multiplication

  • To multiply matrices \(R_{1}C_{1} * R_{2}C_{2}\), \(C_{1}\) must equal \(R_{2}\), resulting in a matrix with dimensions \(R_{1}C_{2}\)

    x <- matrix(1:6, 3)
    dim(x)
    ## [1] 3 2
    y <- matrix(1:6, 2)
    dim(y)
    ## [1] 2 3
    dim(x %*% y)
    ## [1] 3 3

Mechanics of matrix multiplication

  • A matrix product computes the dot product of every row in the first matrix with every column in the second matrix
  • The dot product is calculated by multiplying each number in the first matrices row by its corresponding number in the second matrices column, then summing those values
  • Example: \[\left(\begin{array}{ccc}1 & 2 & 3 \\ 4 & 5 & 6\end{array}\right)\left(\begin{array}{cc}7 & 8 \\ 9 & 10 \\ 11 & 12\end{array}\right) = \left(\begin{array}{cc}58 & 64 \\ 139 & 154 \end{array}\right)\]
    • These two matrices are compatible (\(C_1 = 3 = R_2 = 3\))
    • The resulting matrix with by 2 x 2 (\(R_1 = 2\) x \(C_2 = 2\))
    • The dot product of the 1,1 position of the resulting matrix is the dot product of the first matrices 1st row and the second matrices 1st column \[\left(\begin{array}{ccc}1 & 2 & 3 \\ & & \end{array}\right)\left(\begin{array}{cc}7 & \\ 9 & \\ 11 & \end{array}\right) = \left(\begin{array}{cc}58 & \\ & \end{array}\right)\] \[(1*7) + (2*9) + (3*11) = 58\]
    • The 1,2 position is the dot product of the first matrix 1st row and second matrix 2nd column
    • The 2,1 position is the dot product of the first matrix 2nd row and second matrix 1st column
    • THe 2,2 position is the dot product of the first matrix 2nd row and second matrix 2nd column
    a <- matrix(1:6, 2, 3, byrow = T)
    b <- matrix(7:12, 3, 2, byrow = T)
    c1.1 <- sum(a[1,] * b[,1])
    c1.2 <- sum(a[1,] * b[,2])
    c2.1 <- sum(a[2,] * b[,1])
    c2.2 <- sum(a[2,] * b[,2])
    c <- matrix(c(c1.1,c1.2,c2.1,c2.2), 2, 2, byrow = T)
    all(c == a %*% b)
    ## [1] TRUE

Transposition and the cross product

  • Transposing a matrix converts its rows into columns and visa versa

    x <- matrix(1:6, 2)
    x
    ##      [,1] [,2] [,3]
    ## [1,]    1    3    5
    ## [2,]    2    4    6
    #t(M) will transpose a matrix M
    t(x)
    ##      [,1] [,2]
    ## [1,]    1    2
    ## [2,]    3    4
    ## [3,]    5    6
  • A frequent operation in statistics is to multiply the transpose of one matrix by a second matrix (e.g. \(X^{T}Y\)), this is called the cross product

    x <- matrix(4:9, 2)
    y <- matrix(1:6, 2)
    t(x) %*% y
    ##      [,1] [,2] [,3]
    ## [1,]   14   32   50
    ## [2,]   20   46   72
    ## [3,]   26   60   94
    #crossprod(M,N) will find the cross product between M and N
    t(x) %*% y == crossprod(x,y)
    ##      [,1] [,2] [,3]
    ## [1,] TRUE TRUE TRUE
    ## [2,] TRUE TRUE TRUE
    ## [3,] TRUE TRUE TRUE
    #crossprod(M), when no second matrix is given, will find the cross product of a matrix and itself
    crossprod(x)
    ##      [,1] [,2] [,3]
    ## [1,]   41   59   77
    ## [2,]   59   85  111
    ## [3,]   77  111  145
  • Note that, in the case of cross products, \(R_{1}\) must equal \(R_{2}\), not \(C_{1}\) and \(R_{2}\)

Identity matrix

  • An identity matrix is a square matrix that, when multiplied by matrix \(X\) will equal \(x\)
    • An identity matrix is a matrix of 0s and 1s, where 1s are found at the diagonals
    • Typically the identity matrix is used as \(X*I = X\), so the number of rows and columns in the identity matrix are the same as the number of columns in \(X\) (\(C_{X} = R_{I} = C_{I}\))
#diag(n) will return an identity matrix with n = rows = columns
x <- matrix(1:6, 2)
I <- diag(ncol(x))
I
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1
x %*% I == x
##      [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE

Inverse matrix

  • The inverse matrix such that, when a matrix is multiplied by its inverse, the result is its identity matrix (\(XX^{-1}=I\))
  • Not every matrix has an inverse +How the inverse matrix is calculated is beyond this course, but involves the matrix determinant for square matrices and Gaussian elimination (and other methods) for general \(n\) x \(n\) matrices

    #solve(M) finds the inverse matrix for matrix M
    x <- matrix(c(1,1,1,3,-2,1,2,1,-1), 3, byrow = T)
    solve(x)
    ##            [,1]        [,2]       [,3]
    ## [1,] 0.07692308  0.15384615  0.2307692
    ## [2,] 0.38461538 -0.23076923  0.1538462
    ## [3,] 0.53846154  0.07692308 -0.3846154

    Solving a system of equations

  • Solving a system of linear equations is a common matrix algebra task
  • Example of a system of equations:

a + b + c = 6 3 a 2 b + c = 2 2 a + b c = 1

  • This can be represented as a matrix equation:

( 1 1 1 3 2 1 2 1 1 ) ( a b c ) = ( 6 2 1 ) ( a b c ) = ( 1 1 1 3 2 1 2 1 1 ) 1 ( 6 2 1 )

  • The first matrix is called the co-variate matrix, the second is the variable matrix, and the third is the solution matrix
  • Notice each equation in the system is produced by the dot product of each row in the co-variate matrix and the (single) column of the variable matrix
  • We want to use the co-variate and solution matrices to solve for the values in the variable matrix
    • Algebraically you would do this by dividing (or similarly multipying by a term to the negative first power)
    • With matrices we have to multiply by the inverse (which is similar to mulitplying by a value to its negative first power)
    • Thus \(XY = Z\) can be rearranged to \(Y = X^{-1}Z\) and matrix multiplication can then be used to solve for \(Y\)
    x <- matrix(c(1,1,1,3,-2,1,2,1,-1), 3, byrow = T)
    z <- matrix(c(6,2,1), 3)
    y <- solve(x) %*% z
    y
    ##      [,1]
    ## [1,]    1
    ## [2,]    2
    ## [3,]    3
    #Alternatively, solve(X,Z) can be used so find Y
    solve(x,z)
    ##      [,1]
    ## [1,]    1
    ## [2,]    2
    ## [3,]    3