numpy, pca, eigenvalue, eigenvector

Problem understanding Principal Component Analysis code


Can anyone please explain this line of code to me? P = vectors.T.dot(C.T), the last step of the script below (under "# project data").

I have searched for online documentation but I found nothing.

from numpy import array
from numpy import mean
from numpy import cov
from numpy.linalg import eig

# define a matrix
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
# calculate the mean of each column
M = mean(A.T, axis=1)
print(M)
# center columns by subtracting column means
C = A - M
print(C)
# calculate covariance matrix of centered matrix
V = cov(C.T)
print(V)
# eigendecomposition of covariance matrix
values, vectors = eig(V)
print(vectors)
print(values)
# project data
P = vectors.T.dot(C.T)  # please explain this line
print(P.T)

Solution

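  • vectors.T.dot(C.T) is the matrix product of the transposed array vectors with the transposed array C. eig returns the eigenvectors as the columns of vectors, so vectors.T holds one eigenvector per row, while C.T holds one centered sample per column. Entry (i, j) of P is therefore the dot product of eigenvector i with centered sample j.

    The dot product and projections are related: the dot product of a vector with a unit vector gives the signed length of that vector's projection onto the unit vector's direction. The eigenvectors returned by eig are normalized to unit length, so each entry of P is the coordinate of one sample along one principal direction. P.T then has one projected sample per row, and it is the same array as C.dot(vectors).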

    As your question is rather vague, I'll let you comment on this answer and adapt it if necessary.
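
In case it helps, here is a minimal sketch that reuses the data from the question and checks the points above numerically. The names single, same_as_direct and unit_norms are mine, introduced only for illustration, and the indices i and j are picked arbitrarily.

```python
import numpy as np

# Same data as in the question.
A = np.array([[1, 2], [3, 4], [5, 6]])
C = A - A.mean(axis=0)  # centered data, shape (3, 2): one sample per row

# eig returns the eigenvectors as the COLUMNS of `vectors`.
values, vectors = np.linalg.eig(np.cov(C.T))

# The line from the question: shape (2, 3), row i holds the coordinates
# of every sample along eigenvector i.
P = vectors.T.dot(C.T)

# Entry P[i, j] is the plain dot product of eigenvector i with sample j
# (i and j are arbitrary illustrative indices).
i, j = 0, 2
single = np.dot(vectors[:, i], C[j])

# Transposing P gives one projected sample per row, same as C.dot(vectors).
same_as_direct = np.allclose(P.T, C.dot(vectors))

# Each eigenvector has unit length, so the dot products are signed lengths.
unit_norms = np.linalg.norm(vectors, axis=0)

print(P.shape)
print(np.isclose(P[i, j], single), same_as_direct)
print(unit_norms)
```

The double transpose in the original line is a matter of layout only: since (X.dot(Y)).T equals Y.T.dot(X.T), computing vectors.T.dot(C.T) and transposing the result afterwards gives the same array as C.dot(vectors) directly.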