I'm trying to implement PCA in Python, but when I go to annotate my figure with the principal axes, it appears that my vectors are not orthogonal.
import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg as LA
N = 1000
xTrue = np.linspace(0, 1000, N)
yTrue = 4 * xTrue
xData = xTrue + np.random.normal(0, 100, N)
yData = yTrue + np.random.normal(0, 100, N)
xData = np.reshape(xData, (N, 1))
yData = np.reshape(yData, (N, 1))
Data = np.hstack((xData, yData))
C = np.cov(Data, rowvar = False)
e_vals, e_vecs = LA.eig(C)
e_vals = np.real(e_vals)
e_vecs = -e_vecs
avg = (1/Data.shape[0] )*sum(Data, 0)
arrowprops = dict(arrowstyle = '->', linewidth = 3, shrinkA = 0, shrinkB = 0, color = 'r')
plt.scatter(Data[:,0], Data[:,1])
plt.title("Principal Component Analysis Example: Linear Data")
for i in [0,1]:
ax = plt.gca()
ax.annotate('', avg + np.sqrt(e_vals[i])*e_vecs[:,i], avg, arrowprops = arrowprops)
I can verify that my eigenvectors are orthogonal by showing that np.matmul(e_vecs, e_vecs.T)
is roughly an identity matrix. However the image I get is the following:
Clearly the vectors are not orthogonal in the image, but it doesn't make sense why they aren't since translating them via the vector avg
shouldn't have eliminated this property. Anyone know what's wrong with this? Is this a scaling issue or am I missing some important parameter?
The vectors are mutually orthogonal, but when you plot them they do not appear orthogonal due to the scale of the plot (the x-axis range is much smaller than the y-axis range). This becomes clear if the aspect of the plot is set to 'eqaul'
via matplotlib.axes.Axes.set_aspect
:
for i in [0,1]:
ax = plt.gca()
ax.set_aspect('equal')
ax.annotate('', avg + np.sqrt(e_vals[i])*e_vecs[:,i], avg, arrowprops = arrowprops)
You can of course reverse the scaling for the vectors to force them to be visually orthogonal on a set of axis for which the aspect is non-equal, but that ultimately would not be representative of the actual data. I would recommend either setting the aspect to 'equal'
using set_aspect('equal')
as above or simply noting that the vectors are orthogonal but don't plot orthogonally due to the aspect of the plot.