Search code examples
matlabpcacoordinate-transformationbsxfun

How to project a new point to PCA new basis?


For example, I have 9 variables and 362 cases. I've made PCA calculation, and found out that first 3 PCA coordinates are enough for me.

Now, I have new point in my 9-dimensional structure, and I want to project it to principal component system coordinate. How to get its new coordinates?

%# here is data (362x9)
load SomeData

[W, Y] = pca(data, 'VariableWeights', 'variance', 'Centered', true);

%# orthonormal coefficient matrix
W = diag(std(data))\W;

% Getting mean and weights of data (for future data)
[data, mu, sigma] = zscore(data);
sigma(sigma==0) = 1;

%# New point in original 9dim system
%# For example, it is the first point of our input data
x = data(1,:);
x = bsxfun(@minus,x, mu);
x = bsxfun(@rdivide, x, sigma);

%# New coordinates as principal components
y0 = Y(1,:); %# point we should get in result
y = (W*x')'; %# our result

%# error
sum(abs(y0 - y)) %# 142 => they are not the same point

%# plot
figure()
plot(y0,'g'); hold on;
plot(y,'r');

enter image description here

How to get coordinates of a new point projected to new principal component basis?


Solution

  • Main fallacy was in operation that converts points to new basis:

    y = (W*x')';
    

    Wikipedia says:

    The projected vectors are the columns of the matrix

    Y = W*·Z, 
    

    where Y is L×N, W is M×L, Z is M×N,

    but pca() returns W of size L×M and Y of size NxL

    so, correct equation in Matlab is:

    y = x*W
    

    Below is the corrected code:

    [W, Y] = pca(data, 'VariableWeights', 'variance', 'Centered', true);
    W = diag(std(data))\W;
    
    %# Getting mean and weights of data (for future data)
    [~, mu, we] = zscore(data);
    we(we==0) = 1;
    
    %# New point in original 9dim system
    %# For example, it is the first point of our input data
    x = data(1,:); 
    x = bsxfun(@minus,x, mu);
    x = bsxfun(@rdivide, x, we);
    
    %# New coordinates as principal components
    y = x*W;
    y0 = Y(1,:);
    sum(abs(y0 - y)) %# 4.1883e-14 ~= 0