Question in the title. After calling pca.fit(X), suppose I call pca.fit_transform(new_X). Is new_X automatically centered by PCA? The documentation is unclear on this point.
From the docs:
Linear dimensionality reduction using Singular Value Decomposition of the data to project it to a lower dimensional space. The input data is centered but not scaled for each feature before applying the SVD.
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
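fit_transform is equivalent to running fit and transform consecutively on the same input matrix. The fit function calculates the per-feature means used for centering, and the transform function applies the centering using the means calculated during fit. So yes, new_X is centered, but calling pca.fit_transform(new_X) refits the model on new_X and centers it with its own means, replacing what was learnt from X.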
Therefore, to fit on one matrix and apply the centering parameters learnt from that matrix to another (as, for example, when applying a model learnt on a training set to a test/validation set), you would need to use fit and transform separately.
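A minimal sketch of the difference, assuming scikit-learn and NumPy are installed. The arrays X and new_X are made-up random data, and the variable names are only illustrative. It checks that transform centers new_X with the mean stored in pca.mean_ from the earlier fit, whereas fit_transform overwrites that mean:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration): two sets with different means
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, size=(100, 3))      # e.g. a training set
new_X = rng.normal(loc=-2.0, size=(20, 3))  # e.g. a test set

pca = PCA(n_components=2)
pca.fit(X)
train_mean = pca.mean_.copy()  # per-feature means learnt from X

# transform subtracts the mean learnt from X, not the mean of new_X
projected = pca.transform(new_X)
manual = (new_X - train_mean) @ pca.components_.T
uses_train_mean = np.allclose(projected, manual)

# fit_transform refits on new_X, so the stored mean (and components) change
pca.fit_transform(new_X)
refit_mean = pca.mean_.copy()
mean_replaced = np.allclose(refit_mean, new_X.mean(axis=0))

print(uses_train_mean, mean_replaced)
```

If the goal is to evaluate on held-out data, the fit(X) followed by transform(new_X) pattern is the one that keeps the training-set centering.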