scikit-learn, pca

scikit-learn PCA - transform results


I have a time series of first differences, to which I apply PCA using scikit-learn to get the first principal component (PC):

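import numpy
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# data is a time series of first differences
pca = PCA(n_components=1)
pca.fit(data)
pc1_trans = pca.transform(data)               # projection computed by scikit-learn
pc1_dot = numpy.dot(data, pca.components_.T)  # manual projection onto the first component
plt.plot(numpy.cumsum(pc1_dot))
plt.plot(numpy.cumsum(pc1_trans))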

I thought the dot product (projection) of the original data with the first component would give the same result as calling pca.transform, but it does not (in the resulting plot, the orange line is the output of transform). Why is this?


Solution

  • Found the answer here

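    scikit-learn's PCA applies the transform to the de-meaned data: transform subtracts the per-feature mean estimated in fit (stored as pca.mean_) before projecting onto the components. So, with the default whiten=False, these are equivalent:

    pc1_trans = pca.transform(data)
    # axis=0 gives the column means (same as pca.mean_); data.mean() alone is a single scalar
    pc1_dot = numpy.dot(data - data.mean(axis=0), pca.components_.T)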
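The equivalence can be checked numerically. The following is a minimal sketch, not the asker's actual setup: the data is a synthetic stand-in (random columns with non-zero means), and the names pc1_raw, pc1_centered, offset and drift are introduced here only for illustration. It assumes the default whiten=False. It also suggests why the two plotted curves move apart: the uncentered projection differs from transform by the same constant at every observation, and cumsum turns that constant into a linear drift.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the first-difference series (hypothetical data):
# 500 observations of 4 features, each with a non-zero mean.
rng = np.random.default_rng(0)
data = rng.normal(loc=[0.5, -0.2, 0.1, 0.3], scale=1.0, size=(500, 4))

pca = PCA(n_components=1)
pca.fit(data)
pc1_trans = pca.transform(data)

# Projection without centering vs. projection after subtracting the column means.
pc1_raw = np.dot(data, pca.components_.T)
pc1_centered = np.dot(data - pca.mean_, pca.components_.T)

# The uncentered projection is off by one constant per observation ...
offset = np.dot(pca.mean_, pca.components_.T)  # shape (1,)

# ... which the cumulative sum turns into a linear drift between the two curves.
drift = np.cumsum(pc1_raw) - np.cumsum(pc1_trans)
steps = np.arange(1, len(data) + 1)

print(np.allclose(pc1_centered, pc1_trans))       # centered projection vs. transform
print(np.allclose(pc1_raw - pc1_trans, offset))   # constant per-observation gap
print(np.allclose(drift, steps * offset[0]))      # gap grows linearly under cumsum
```

If the columns of the data already have zero mean, the offset vanishes and both projections coincide without explicit centering.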