Notes to myself
Eigenvectors are vectors that only scale (i.e., change in magnitude, not direction) when a given linear transformation (represented by a matrix) is applied to them. The scaling factor by which an eigenvector is multiplied when the transformation is applied is called an eigenvalue.
Given a square matrix $A$, a vector $v$ is considered an eigenvector of $A$ if $v$ is not the zero vector and there is some scalar $\lambda$ such that applying $A$ to $v$ results in a scalar multiple of $v$, i.e., the direction of $v$ remains unchanged. In equation form, this is written as $Av = \lambda v$, where juxtaposition denotes multiplication (matrix multiplication on the left-hand side, scalar multiplication on the right-hand side).
$\lambda$ is the eigenvalue corresponding to the eigenvector $v$ in the above equation. It represents the scalar multiple by which the eigenvector is stretched or compressed (if you can’t recall linear transformations, you can refer to Khan Academy’s Matrix Transformations lecture for a refresher).
To find the eigenvalues of a matrix $A$, we follow two steps. First we set up the characteristic equation, and then we solve for $\lambda$:
- ☝️ Characteristic Equation: You set up the equation $\det(A - \lambda I) = 0$, where $\det$ denotes the determinant of a matrix, and $I$ is the identity matrix of the same size as $A$. This equation is derived by rewriting the eigenvector equation $Av = \lambda v$ as $(A - \lambda I)v = 0$ and using the fact that $v$ is non-zero: a non-zero solution exists only when $A - \lambda I$ is singular, i.e., when its determinant is zero.
- ✌️ Solve for $\lambda$: Solving the characteristic equation will give you the eigenvalues of the matrix $A$ (see the short numpy sketch after this list).
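As a sanity check on these two steps, numpy can compute both the characteristic polynomial and its roots. This is a minimal sketch, assuming the 2x2 matrix from the worked example below:
import numpy as np

A = np.array([[4, 1], [2, 3]])  # the example matrix used later in these notes

# Step 1: np.poly(A) returns the coefficients of the characteristic
# polynomial det(A - lambda*I), highest degree first:
# [1, -7, 10] corresponds to lambda^2 - 7*lambda + 10.
coeffs = np.poly(A)
print(coeffs)  # [  1.  -7.  10.]

# Step 2: the eigenvalues are the roots of that polynomial.
print(np.roots(coeffs))  # [5. 2.]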
Once the eigenvalues are known, the eigenvectors can be found by:
- Substitution: For each eigenvalue $\lambda$, you substitute it back into the equation $Av = \lambda v$ (which can be rewritten as $(A - \lambda I)v = 0$) and solve for $v$.
- Solving the System: Typically, you’ll get a system of linear equations for the components of $v$, which you’ll need to solve. Any non-zero vector $v$ that satisfies the system of equations is considered an eigenvector corresponding to the eigenvalue $\lambda$ (see the numerical sketch after this list).
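Numerically, one way to carry out this step is via the SVD: the right-singular vector for the smallest singular value of $A - \lambda I$ spans its null space. A sketch, assuming the example matrix used below (the helper `eigenvector_for` is my own name, not a numpy function):
import numpy as np

A = np.array([[4, 1], [2, 3]])

def eigenvector_for(A, lam):
    # The right-singular vector for the smallest singular value of
    # M = A - lam*I spans the null space of M when lam is an eigenvalue.
    M = A - lam * np.eye(A.shape[0])
    _, _, vt = np.linalg.svd(M)
    return vt[-1]  # unit-norm v with M @ v ~ 0

for lam in (5, 2):
    v = eigenvector_for(A, lam)
    print(lam, v, np.allclose(A @ v, lam * v))  # prints True for both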
Let’s consider the 2x2 matrix $A = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}$.
- Characteristic Equation: First, we find the determinant of $A - \lambda I$: $\det\begin{pmatrix} 4-\lambda & 1 \\ 2 & 3-\lambda \end{pmatrix} = (4-\lambda)(3-\lambda) - 1 \cdot 2 = \lambda^2 - 7\lambda + 10$.
- Solving for $\lambda$: We solve $\lambda^2 - 7\lambda + 10 = 0$ to find the eigenvalues. The quadratic factors as $(\lambda - 5)(\lambda - 2) = 0$, so the solutions are the eigenvalues of $A$, which are $\lambda_1 = 5$ and $\lambda_2 = 2$.
Now comes the real magic. We can find the eigenvectors by plugging each eigenvalue into the equation $(A - \lambda I)v = 0$ and solving for $v$. For $\lambda_1 = 5$:
The system $\begin{pmatrix} -1 & 1 \\ 2 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ simplifies to $x = y$, so one eigenvector could be $v = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ for $\lambda_1 = 5$.
Similarly, for $\lambda_2 = 2$, the system $\begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ simplifies to $y = -2x$, so one eigenvector could be $v = \begin{pmatrix} 1 \\ -2 \end{pmatrix}$ for $\lambda_2 = 2$.
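As a quick check, multiplying through by hand shows each vector is only scaled, never rotated: $A\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 5 \\ 5 \end{pmatrix} = 5\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, and $A\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} 2 \\ -4 \end{pmatrix} = 2\begin{pmatrix} 1 \\ -2 \end{pmatrix}$.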
This process reveals the eigenvalues $\lambda_1 = 5$ and $\lambda_2 = 2$, with corresponding eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$, respectively. Each eigenvector is associated with one eigenvalue, and these vectors indicate the “directions” in which the linear transformation represented by matrix $A$ acts by stretching/compressing, without rotating.
Using numpy to find the eigenvalues and eigenvectors.
import numpy as np
A = np.array([[4, 1], [2, 3]])
# np.linalg.eig returns the eigenvalues and a matrix whose columns are the
# corresponding eigenvectors, normalized so their Euclidean norms are 1.
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Matrix A:")
print(A)
print("\nEigenvalues:")
print(eigenvalues)
print("\nEigenvectors:")
print(eigenvectors)

Output:
Matrix A:
[[4 1]
[2 3]]
Eigenvalues:
[5. 2.]
Eigenvectors:
[[ 0.70710678 -0.4472136 ]
[ 0.70710678 0.89442719]]
In this output, the eigenvalues are 5 and 2, which match the mathematical solution I calculated. The eigenvectors returned by numpy are the columns of the matrix and are normalized (i.e., they have unit length of 1 in Euclidean space), so they may look different from the ones I calculated by hand, but they lie along the same directions.
The first eigenvector (first column) is approximately $\begin{pmatrix} 0.707 \\ 0.707 \end{pmatrix}$, which points in the same direction as $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, and the second eigenvector (second column) is approximately $\begin{pmatrix} -0.447 \\ 0.894 \end{pmatrix}$, which lies along the same line as $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$ (it is that vector normalized and flipped in sign, which is still a valid eigenvector). The direction, up to a non-zero scalar multiple, is the critical property of the eigenvector, not the magnitude.
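A small check of the column convention, reusing the same matrix: multiplying $A$ by the eigenvector matrix should scale each column by its eigenvalue.
import numpy as np

A = np.array([[4, 1], [2, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Each column j of `eigenvectors` satisfies A @ v_j = eigenvalues[j] * v_j;
# broadcasting `eigenvectors * eigenvalues` scales column j by eigenvalues[j].
print(np.allclose(A @ eigenvectors, eigenvectors * eigenvalues))  # True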
We can verify this by normalizing the hand-calculated vectors. Normalizing involves dividing each component of a vector by its length. For example, suppose the vector is $v = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
First, we calculate the magnitude $\|v\|$ (Euclidean norm): $\|v\| = \sqrt{1^2 + 1^2} = \sqrt{2} \approx 1.414$. Then, divide each component of the original vector by this magnitude: $v / \|v\| = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \approx \begin{pmatrix} 0.707 \\ 0.707 \end{pmatrix}$.
Or just use numpy.
import numpy as np

# Define the original (hand-calculated) eigenvectors, one per row
vectors = np.array([[1, 1], [1, -2]])

# Function to normalize a vector
def normalize_vector(vector):
    # Calculate its magnitude (Euclidean norm)
    magnitude = np.linalg.norm(vector)
    normalized_vector = vector / magnitude
    return normalized_vector

# Normalize the vectors and print the results
for vector in vectors:
    normalized_vector = normalize_vector(vector)
    print(f"Original vector: {vector}")
    print(f"Normalized vector: {normalized_vector}\n")
Both approaches give the same normalized vectors up to sign: numpy returned $\begin{pmatrix} -0.447 \\ 0.894 \end{pmatrix}$ for the second eigenvector, while normalizing $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$ by hand gives its negative. Any non-zero scalar multiple of an eigenvector (including $-v$) is still an eigenvector, so both describe the same eigendirection.
Output:
Original vector: [1 1]
Normalized vector: [0.70710678 0.70710678]
Original vector: [ 1 -2]
Normalized vector: [ 0.4472136 -0.89442719]
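To make the sign point concrete, here is a tiny check (reusing $A$ from above) that both $v$ and $-v$ are eigenvectors for the eigenvalue 2:
import numpy as np

A = np.array([[4, 1], [2, 3]])
v = np.array([0.4472136, -0.89442719])  # normalized (1, -2)

# Flipping the sign of an eigenvector leaves A v = 2 v intact.
print(np.allclose(A @ v, 2 * v))    # True
print(np.allclose(A @ -v, 2 * -v))  # True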
Dassit 👋