Day 7: Linear Algebra - Matrices and Matrix Operations#
Introduction to Matrices#
What is a Matrix?#
Definition: A matrix is a rectangular array of numbers, symbols, or expressions arranged in rows and columns, usually represented using a capital letter such as \(X\). Each entry in a matrix is called an element. A matrix is often denoted as:
\[\begin{split} X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix} \end{split}\]where \(m\) is the number of rows, and \(n\) is the number of columns.
Importance: Matrices have diverse applications across various fields, just like vectors. They are fundamental in mathematics, physics, engineering, computer science, and many other disciplines. Some key applications include:
In mathematics: Matrices are used in linear algebra for solving systems of linear equations, eigenvalue problems, and transformations.
In physics: Matrices are used to represent physical quantities, such as the moment of inertia tensor in mechanics and quantum state representations in quantum mechanics.
In engineering: Matrices are employed to describe systems of equations in control theory, electrical circuits, and structural analysis.
In computer science: Matrices are used for image processing, data compression, and graph algorithms, among other applications.
Matrices play a crucial role in various mathematical and computational operations, making them an essential concept in the study of linear algebra and related fields.
Mastering Matrices in Python#
Kickoff with NumPy
We’ll continue using NumPy since it’s fantastic for handling matrices (just like it was for vectors).
import numpy as np
Creating Matrices#
Let’s start by creating a couple of matrices. Think of these as grids filled with numbers.
A dimension subscript (e.g. \(X_{m \times n}\)) isn't strictly necessary; however, it is a nice reminder that the dimensions of a matrix are rows \(m\) × columns \(n\).
x1 = np.array([[1, 2, 3],
[4, 5, 6]])
#Notice that however you lay out the declaration (one line or several), the matrices are created the same and output the same.
y1 = np.array([[7, 8, 9], [10, 11, 12]])
z1 = np.array([[1, 3], [5, 7], [9, 11]])
print(x1)
print(x1.shape, '\n') #The .shape attribute after the variable lets you verify the dimensions of the matrix.
print(y1)
print(y1.shape, '\n')
print(z1)
print(z1.shape)
[[1 2 3]
[4 5 6]]
(2, 3)
[[ 7 8 9]
[10 11 12]]
(2, 3)
[[ 1 3]
[ 5 7]
[ 9 11]]
(3, 2)
Basic Matrix Operations: Addition and Subtraction#
Two matrices that have the same shape can be added or subtracted.
The subscripts \(i\) and \(j\) simply denote the row and column coordinates of an element within the matrix.
m_add = np.add(x1, y1) #Matrix addition
m_sub = np.subtract(x1, y1) #Matrix subtraction
print(m_add, '\n') #OR
print(x1+y1, '\n')
print(m_sub, '\n') #OR
print(x1-y1)
#Prefer storing the result of an operation in a named variable rather than computing it inside the print statement.
[[ 8 10 12]
[14 16 18]]
[[ 8 10 12]
[14 16 18]]
[[-6 -6 -6]
[-6 -6 -6]]
[[-6 -6 -6]
[-6 -6 -6]]
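As a quick sanity check of the same-shape rule (not part of the original notebook), attempting to add the 2x3 matrix `x1` to the 3x2 matrix `z1` raises a `ValueError`; a minimal sketch, redefining the variables so it runs standalone:

```python
import numpy as np

x1 = np.array([[1, 2, 3], [4, 5, 6]])     # shape (2, 3)
z1 = np.array([[1, 3], [5, 7], [9, 11]])  # shape (3, 2)

# Adding matrices whose shapes differ raises a ValueError
try:
    np.add(x1, z1)
except ValueError as err:
    print("Cannot add:", err)
```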
Matrix Multiplication: The Core of Complex Calculations#
A matrix of any shape can be multiplied by a scalar (i.e. a number describing magnitude).
Matrix Multiplication:#
Definition: Matrix multiplication, denoted \(Z = X \cdot Y\), is an operation between two matrices \(X\) (\(m \times n\)) and \(Y\) (\(n \times p\)), where the number of columns in \(X\) is equal to the number of rows in \(Y\). The resulting matrix \(Z\) has dimensions \(m \times p\).
Mathematical Formula: The element \(Z[i][j]\) of the resulting matrix \(Z\) is obtained by taking the dot product of the \(i\)th row of matrix \(X\) and the \(j\)th column of matrix \(Y\):
\[ Z[i][j] = \sum_{k=1}^{n} X[i][k] \cdot Y[k][j] \]
Key Rules:
Compatibility: For matrix multiplication to be defined, the number of columns in \(X\) must be equal to the number of rows in \(Y\) (i.e., the inner dimensions must match).
Associativity: Matrix multiplication is associative, meaning that \((X \cdot Y) \cdot Z = X \cdot (Y \cdot Z)\) if the dimensions allow.
Distributivity: Matrix multiplication distributes over matrix addition, i.e., \(X \cdot (Y + Z) = (X \cdot Y) + (X \cdot Z)\).
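These rules can be spot-checked numerically. The sketch below uses small random integer matrices (the shapes and seed are arbitrary illustrative choices, not from the original); because the entries are integers, the comparisons are exact:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(2, 3))
Y = rng.integers(0, 10, size=(3, 4))
Z = rng.integers(0, 10, size=(4, 2))
W = rng.integers(0, 10, size=(3, 4))

# Associativity: (XY)Z == X(YZ) when the dimensions allow
print(np.array_equal((X @ Y) @ Z, X @ (Y @ Z)))

# Distributivity: X(Y + W) == XY + XW
print(np.array_equal(X @ (Y + W), X @ Y + X @ W))
```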
Element-wise Product (Hadamard Product):#
Definition: The element-wise (Hadamard) product, denoted \(Z = X \circ Y\), is an element-wise operation between two matrices \(X\) and \(Y\) of the same dimensions. The resulting matrix \(Z\) has the same dimensions as \(X\) and \(Y\).
Mathematical Formula: Each element \(Z[i][j]\) of the resulting matrix \(Z\) is obtained by multiplying the corresponding elements of matrices \(X\) and \(Y\):
\(Z[i][j] = X[i][j] \cdot Y[i][j]\)
Key Rule: The element-wise product is applied element by element, meaning each element of the resulting matrix is calculated independently from the corresponding elements of the input matrices.
Use Cases: The element-wise product is often used to compute element-wise differences or similarities between matrices. It appears in many mathematical and computational tasks, including image processing and certain types of data transformations.
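The cells below demonstrate scalar and matrix multiplication but not the element-wise form itself, so here is a minimal sketch of it (reusing the `x1` and `y1` values from earlier; in NumPy the `*` operator, or equivalently `np.multiply`, is element-wise, while `@` is matrix multiplication):

```python
import numpy as np

x1 = np.array([[1, 2, 3], [4, 5, 6]])
y1 = np.array([[7, 8, 9], [10, 11, 12]])

# Element-wise (Hadamard) product: use * or np.multiply, NOT @
m_had = x1 * y1
print(m_had)
# [[ 7 16 27]
#  [40 55 72]]
```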
m_sca = 5 * x1 #Scalar multiplication
print(m_sca, '\n')
m_mul = x1@z1 #Matrix multiplication: (2x3) @ (3x2) -> (2x2)
print(m_mul,'\n')
#These two lines create random 4x4 matrices with integers 0-9 (the upper bound is exclusive)
x2 = np.random.randint(0, 10, size = (4,4))
y2 = np.random.randint(0, 10, size = (4,4))
#These two lines show that the commutative property does not hold for matrix multiplication
m_mul1 = x2@y2
m_mul2 = y2@x2
print(m_mul1, '\n')
print(m_mul2)
[[ 5 10 15]
[20 25 30]]
[[ 38 50]
[ 83 113]]
[[159 78 84 75]
[152 56 122 64]
[120 75 77 60]
[108 32 68 28]]
[[ 86 144 38 120]
[ 66 60 18 120]
[103 133 67 160]
[ 84 116 60 107]]
Transpose and Inverse: Flipping and Reversing#
Transposing (flipping) and finding the inverse (kind of like a reverse) of a matrix are crucial operations in machine learning.
\[ X = \begin{bmatrix} 1 & 3 & 5 \\ 7 & 9 & 11 \\ 13 & 15 & 17 \end{bmatrix} \hspace{1cm} X^T = \begin{bmatrix} 1 & 7 & 13 \\ 3 & 9 & 15 \\ 5 & 11 & 17 \end{bmatrix} \]
Transposing a Transpose:
Transposing a matrix twice returns the original matrix.
Mathematical Notation: \((X^T)^T = X\)
Transposition of a Sum:
The transpose of the sum of two matrices is equal to the sum of their transposes.
Mathematical Notation: \((X + Y)^T = X^T + Y^T\)
Transposition of a Product:
The transpose of the product of two matrices is equal to the product of their transposes taken in reverse order.
Mathematical Notation: \((XY)^T = Y^T \cdot X^T\)
Transposition of a Scalar:
Transposing a scalar (a single number) has no effect.
Mathematical Notation: \((k)^T = k\)
X = np.array([[1, 3, 5], [7, 9, 11], [13, 15, 17]])
X_T = X.transpose()
print(X,'\n')
print(X_T)
[[ 1 3 5]
[ 7 9 11]
[13 15 17]]
[[ 1 7 13]
[ 3 9 15]
[ 5 11 17]]
Matrix Inverse Rules and Properties:#
Existence of an Inverse:
Not all matrices have inverses. A matrix \(X\) has an inverse \(X^{-1}\) if and only if it is square and its determinant \(\det(X)\) is nonzero.
Product with an Inverse:
Multiplying a matrix by its inverse results in the identity matrix.
Mathematical Notation: \(X \cdot X^{-1} = I\)
Order Matters:
Matrix multiplication is not commutative. The order of multiplication matters when finding the inverse of a product.
To find the inverse of \(XY\), find the inverses of \(X\) and \(Y\) separately and then multiply them in reverse order: \((XY)^{-1} = Y^{-1} \cdot X^{-1}\).
Inverse of Transpose:
The inverse of a transpose is the transpose of the inverse.
Mathematical Notation: \((X^T)^{-1} = (X^{-1})^T\)
Inverse of a Scalar:
The inverse of a nonzero scalar \(k\) is \(1/k\).
These rules and properties are fundamental when working with matrices and are essential for various mathematical and computational tasks, including solving systems of linear equations, performing transformations, and more.
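A quick numerical check of these properties, using small invertible matrices (the specific matrices are illustrative choices, not from the original; each comparison prints `True`, with `np.allclose` absorbing floating-point rounding):

```python
import numpy as np

# A small invertible matrix: det = 1*4 - 2*3 = -2, which is nonzero
X = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([[2.0, 0.0], [1.0, 2.0]])

X_inv = np.linalg.inv(X)

# X @ X^-1 equals the identity matrix (up to rounding)
print(np.allclose(X @ X_inv, np.eye(2)))

# (XY)^-1 == Y^-1 @ X^-1 (reverse order)
print(np.allclose(np.linalg.inv(X @ Y),
                  np.linalg.inv(Y) @ np.linalg.inv(X)))

# (X^T)^-1 == (X^-1)^T
print(np.allclose(np.linalg.inv(X.T), X_inv.T))
```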
X = np.array([[1, 3, 5], [7, 9, 11], [13, 15, 17]])
X_inv = np.linalg.inv(X)
print(X, '\n')
print(X_inv)
[[ 1 3 5]
[ 7 9 11]
[13 15 17]]
[[-2.81474977e+14 5.62949953e+14 -2.81474977e+14]
[ 5.62949953e+14 -1.12589991e+15 5.62949953e+14]
[-2.81474977e+14 5.62949953e+14 -2.81474977e+14]]
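One caution about the output above: this particular \(X\) is actually singular (its rows form an arithmetic progression, so \(\det(X) = 0\)), which is why the printed "inverse" is full of enormous values; it is floating-point noise, not a real inverse. Checking the determinant first, as the existence rule above suggests, makes this visible:

```python
import numpy as np

X = np.array([[1, 3, 5], [7, 9, 11], [13, 15, 17]])

# The determinant is (numerically) zero, so X has no true inverse;
# np.linalg.inv returns floating-point noise instead of failing loudly.
print(np.linalg.det(X))

if abs(np.linalg.det(X)) < 1e-9:
    print("X is singular; the 'inverse' computed above is not meaningful")
```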
Activity: Matrix Operations#
Step-by-step Instructions#
Using numpy, create two 3x3 matrices, named `mat_X` and `mat_Y`:
\[ mat_{X} = \begin{bmatrix} 5 & 3 & 1 \\ 9 & 6 & 3 \\ 13 & 12 & 11 \end{bmatrix} \hspace{1cm} mat_{Y} = \begin{bmatrix} 4 & 7 & 8 \\ 22 & 45 & 76 \\ 32 & 24 & 54 \end{bmatrix} \]
Create a variable called `mat_XY` by multiplying `mat_X` and `mat_Y` together.
Create a variable called `mat_XY_T`; this new variable will be the transposition of `mat_XY`.
Create a variable called `mat_XY_T2`; this new variable will be a 2nd way that you can write the transposition of `mat_XY`.
Create a variable called `mat_XY_inv`; this new variable will be the inverse of the `mat_XY_T` variable.
Display `mat_XY`, `mat_XY_T`, `mat_XY_T2`, and `mat_XY_inv`.
import numpy as np
mat_X = np.array([[5, 3, 1], [9, 6, 3], [13, 12, 11]])
mat_Y = np.array([[4, 7, 8], [22, 45, 76], [32, 24, 54]])
mat_XY = mat_X@mat_Y
mat_XY_T = mat_XY.T
mat_XY_T2 = np.transpose(mat_XY)
mat_XY_inv = np.linalg.inv(mat_XY_T)
print(mat_XY, '\n')
print(mat_XY_T, '\n')
print(mat_XY_T2, '\n')
print(mat_XY_inv)
[[ 118 194 322]
[ 264 405 690]
[ 668 895 1610]]
[[ 118 264 668]
[ 194 405 895]
[ 322 690 1610]]
[[ 118 264 668]
[ 194 405 895]
[ 322 690 1610]]
[[-2.81926058e+13 -2.93203101e+13 2.79964834e+13]
[ 1.97348241e+13 2.05242171e+13 -1.95975384e+13]
[-2.81926058e+12 -2.93203101e+12 2.79964834e+12]]
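As in the earlier inverse example, the huge entries in `mat_XY_inv` are a red flag: `mat_X` happens to be singular (\(\det(mat_X) = 0\)), so `mat_XY_T` has no true inverse either. A determinant check (a small sketch, not part of the original activity) confirms it:

```python
import numpy as np

mat_X = np.array([[5, 3, 1], [9, 6, 3], [13, 12, 11]])
mat_Y = np.array([[4, 7, 8], [22, 45, 76], [32, 24, 54]])
mat_XY_T = (mat_X @ mat_Y).T

# Both determinants are ~0: mat_X is singular, so mat_XY_T is too,
# and the "inverse" printed above is numerical noise, not a real inverse.
print(np.linalg.det(mat_X))
print(np.linalg.det(mat_XY_T))
```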
Further Resources#
Khan Academy - Linear Algebra:
Description: Khan Academy offers a comprehensive course on linear algebra, including matrix operations, transformations, and applications. It covers the fundamentals and provides interactive exercises for practice.
MIT OpenCourseWare - Introduction to Linear Algebra:
Description: MIT’s OpenCourseWare provides free access to their course materials, including lecture notes, assignments, and video lectures for “Introduction to Linear Algebra.” This course covers matrix operations, determinants, and eigenvalues.
Coursera - Mathematics for Machine Learning: Linear Algebra:
Description: This Coursera course is part of the “Mathematics for Machine Learning” specialization and focuses on linear algebra concepts, including matrix operations. It’s suitable for those interested in the application of linear algebra in machine learning.
Link: Coursera - Mathematics for Machine Learning: Linear Algebra