What does sklearn's PCA do to the input array when the number of components is chosen to be the same as the number of features?

For example, we have:

from sklearn.decomposition import PCA
import numpy as np 

xx = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
pca = PCA()
pca.fit_transform(xx)

Output:

array([[ 1.38340578,  0.2935787 ],
       [ 2.22189802, -0.25133484],
       [ 3.6053038 ,  0.04224385],
       [-1.38340578, -0.2935787 ],
       [-2.22189802,  0.25133484],
       [-3.6053038 , -0.04224385]])

In this case I am not reducing the dimensionality, yet the array is changed. Why?

Solution

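PCA applies a linear transformation to your feature space: it centers the data and then re-expresses it along the principal axes, which amounts to a rotation (possibly combined with a reflection). Keeping all components discards nothing, but the coordinates still change because the axes change. Your data already has zero mean in both features, so only the rotation remains. If feature 1 lies along x and feature 2 along y, the transformation is the same as rotating your feature vectors through an angle of theta ~ 2.565 radians. Below I've defined such a rotation matrix and show that you get the same result: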

import numpy as np
def rot_matrix(theta):
    # returns the 2D rotation matrix for an angle theta (in radians)
    rotation_matrix = np.array([[np.cos(theta), -np.sin(theta)],
                                [np.sin(theta), np.cos(theta)]])
    return rotation_matrix

theta = 2.565
rot = rot_matrix(theta)
np.dot(rot, xx.T).T

The result is close to the output of the PCA transform:

array([[ 1.38349574,  0.29315446],
       [ 2.22182084, -0.25201619],
       [ 3.60531658,  0.04113827],
       [-1.38349574, -0.29315446],
       [-2.22182084,  0.25201619],
       [-3.60531658, -0.04113827]])
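
The angle above does not have to be guessed. As a rough sketch (using plain NumPy and an SVD of the centered data rather than sklearn itself, so the variable names `centered`, `vt`, `projected` and `theta` are my own), you can recover the principal axes and the rotation angle directly from the data. The sign of each axis is arbitrary, so individual columns may come out flipped relative to the sklearn output:

```python
import numpy as np

xx = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]], dtype=float)

# Center the data (a no-op here, since both columns already have zero mean).
centered = xx - xx.mean(axis=0)

# The rows of vt are the principal axes: unit length and mutually orthogonal.
_, _, vt = np.linalg.svd(centered, full_matrices=False)

# Keeping every axis is a change of basis, not a reduction.
projected = centered @ vt.T

# An orthogonal matrix satisfies vt @ vt.T == identity.
is_orthogonal = np.allclose(vt @ vt.T, np.eye(2))

# Rotation angle that aligns the first principal axis with x,
# taken modulo pi because the sign of the axis is arbitrary.
theta = (-np.arctan2(vt[0, 1], vt[0, 0])) % np.pi

print(projected)
print(is_orthogonal, theta)
```

Here `theta` should come out near the 2.565 radians used above, and `projected` should match the PCA output up to the sign of each column. In sklearn, the fitted `pca.components_` plays the role of `vt`.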