Basic matrix operators - Hello. I'm Ben Solomon.

Matrix multiplication

Requirements for matrix multiplication

To multiply matrices $R_{1}C_{1} * R_{2}C_{2}$, $C_{1}$ must equal $R_{2}$, resulting in a matrix with dimensions $R_{1}C_{2}$
```
x <- matrix(1:6, 3)
dim(x)
```
```
## [1] 3 2
```
```
y <- matrix(1:6, 2)
dim(y)
```
```
## [1] 2 3
```
```
dim(x %*% y)
```
```
## [1] 3 3
```

Mechanics of matrix multiplication

A matrix product computes the dot product of every row in the first matrix with every column in the second matrix
The dot product is calculated by multiplying each number in the first matrices row by its corresponding number in the second matrices column, then summing those values
Example: \[\left(\begin{array}{ccc}1 & 2 & 3 \\ 4 & 5 & 6\end{array}\right)\left(\begin{array}{cc}7 & 8 \\ 9 & 10 \\ 11 & 12\end{array}\right) = \left(\begin{array}{cc}58 & 64 \\ 139 & 154 \end{array}\right)\]
- These two matrices are compatible ($C_1 = 3 = R_2 = 3$)
- The resulting matrix with by 2 x 2 ($R_1 = 2$ x $C_2 = 2$)
- The dot product of the 1,1 position of the resulting matrix is the dot product of the first matrices 1st row and the second matrices 1st column \[\left(\begin{array}{ccc}1 & 2 & 3 \\ & & \end{array}\right)\left(\begin{array}{cc}7 & \\ 9 & \\ 11 & \end{array}\right) = \left(\begin{array}{cc}58 & \\ & \end{array}\right)\] \[(1*7) + (2*9) + (3*11) = 58\]
- The 1,2 position is the dot product of the first matrix 1st row and second matrix 2nd column
- The 2,1 position is the dot product of the first matrix 2nd row and second matrix 1st column
- THe 2,2 position is the dot product of the first matrix 2nd row and second matrix 2nd column
```
a <- matrix(1:6, 2, 3, byrow = T)
b <- matrix(7:12, 3, 2, byrow = T)
c1.1 <- sum(a[1,] * b[,1])
c1.2 <- sum(a[1,] * b[,2])
c2.1 <- sum(a[2,] * b[,1])
c2.2 <- sum(a[2,] * b[,2])
c <- matrix(c(c1.1,c1.2,c2.1,c2.2), 2, 2, byrow = T)
all(c == a %*% b)
```
```
## [1] TRUE
```

Transposition and the cross product

Transposing a matrix converts its rows into columns and visa versa

x <- matrix(1:6, 2)
x

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

#t(M) will transpose a matrix M
t(x)

##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
## [3,]    5    6

A frequent operation in statistics is to multiply the transpose of one matrix by a second matrix (e.g. $X^{T}Y$), this is called the cross product

x <- matrix(4:9, 2)
y <- matrix(1:6, 2)
t(x) %*% y

##      [,1] [,2] [,3]
## [1,]   14   32   50
## [2,]   20   46   72
## [3,]   26   60   94

#crossprod(M,N) will find the cross product between M and N
t(x) %*% y == crossprod(x,y)

##      [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE
## [3,] TRUE TRUE TRUE

#crossprod(M), when no second matrix is given, will find the cross product of a matrix and itself
crossprod(x)

##      [,1] [,2] [,3]
## [1,]   41   59   77
## [2,]   59   85  111
## [3,]   77  111  145

Note that, in the case of cross products, $R_{1}$ must equal $R_{2}$, not $C_{1}$ and $R_{2}$

Identity matrix

An identity matrix is a square matrix that, when multiplied by matrix $X$ will equal $x$
- An identity matrix is a matrix of 0s and 1s, where 1s are found at the diagonals
- Typically the identity matrix is used as $X*I = X$, so the number of rows and columns in the identity matrix are the same as the number of columns in $X$ ($C_{X} = R_{I} = C_{I}$)

#diag(n) will return an identity matrix with n = rows = columns
x <- matrix(1:6, 2)
I <- diag(ncol(x))
I

##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1

x %*% I == x

##      [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE

Inverse matrix

The inverse matrix such that, when a matrix is multiplied by its inverse, the result is its identity matrix ($XX^{-1}=I$)

Not every matrix has an inverse +How the inverse matrix is calculated is beyond this course, but involves the matrix determinant for square matrices and Gaussian elimination (and other methods) for general $n$ x $n$ matrices

#solve(M) finds the inverse matrix for matrix M
x <- matrix(c(1,1,1,3,-2,1,2,1,-1), 3, byrow = T)
solve(x)

##            [,1]        [,2]       [,3]
## [1,] 0.07692308  0.15384615  0.2307692
## [2,] 0.38461538 -0.23076923  0.1538462
## [3,] 0.53846154  0.07692308 -0.3846154

Solving a system of equations

Solving a system of linear equations is a common matrix algebra task
Example of a system of equations:

$\begin{aligned} a + b + c & = 6 \\ 3 a - 2 b + c & = 2 \\ 2 a + b - c & = 1 \end{aligned}$

This can be represented as a matrix equation:

$(\begin{matrix} 1 & 1 & 1 \\ 3 & - 2 & 1 \\ 2 & 1 & - 1 \end{matrix}) (\begin{matrix} a \\ b \\ c \end{matrix}) = (\begin{matrix} 6 \\ 2 \\ 1 \end{matrix}) ⟹ (\begin{matrix} a \\ b \\ c \end{matrix}) = {(\begin{matrix} 1 & 1 & 1 \\ 3 & - 2 & 1 \\ 2 & 1 & - 1 \end{matrix})}^{- 1} (\begin{matrix} 6 \\ 2 \\ 1 \end{matrix})$

The first matrix is called the co-variate matrix, the second is the variable matrix, and the third is the solution matrix
Notice each equation in the system is produced by the dot product of each row in the co-variate matrix and the (single) column of the variable matrix
We want to use the co-variate and solution matrices to solve for the values in the variable matrix
- Algebraically you would do this by dividing (or similarly multipying by a term to the negative first power)
- With matrices we have to multiply by the inverse (which is similar to mulitplying by a value to its negative first power)
- Thus $XY = Z$ can be rearranged to $Y = X^{-1}Z$ and matrix multiplication can then be used to solve for $Y$
```
x <- matrix(c(1,1,1,3,-2,1,2,1,-1), 3, byrow = T)
z <- matrix(c(6,2,1), 3)
y <- solve(x) %*% z
y
```
```
##      [,1]
## [1,]    1
## [2,]    2
## [3,]    3
```
```
#Alternatively, solve(X,Z) can be used so find Y
solve(x,z)
```
```
##      [,1]
## [1,]    1
## [2,]    2
## [3,]    3
```