My data matrix is X, which is 4999*37152. Then I use this command in Matlab:
[coeff, score, latent, tsquared1, explained1] = pca(X);
The output: coeff is 37152*4998, score is 4999*4998, latent is 4998*1. According to http://www.mathworks.com/help/stats/pca.html, the coeff should be p*p. So what is wrong with my code?
As the MATLAB documentation says, "Rows of X correspond to observations and columns correspond to variables". So you are feeding in a matrix with only 4999 observations for 37152 variables. Geometrically, you have 4999 points in a 37152-dimensional space. Any 4999 points are contained in an affine subspace of dimension at most 4998 (after pca subtracts the mean, the centered data has rank at most 4998), so MATLAB gets you 4998 directions there, each expressed as a vector with 37152 components.
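A scaled-down sketch of the same situation may make the sizes easier to see. The 5-by-20 random matrix below is a stand-in chosen for illustration, not your data, and the variable names are arbitrary. It assumes the Statistics and Machine Learning Toolbox and relies on the 'Economy' option of pca, which defaults to true:

```matlab
% Toy stand-in for the real data: n = 5 observations, p = 20 variables.
rng(0);                              % fixed seed so the run is repeatable
X = randn(5, 20);

% Centering the columns removes one degree of freedom, so the centered
% data has rank at most n - 1.
Xc = bsxfun(@minus, X, mean(X, 1));
r = rank(Xc);

% Default call ('Economy' is true): only the n - 1 components that can
% carry variance are returned.
[coeff, score, latent] = pca(X);
size_coeff = size(coeff);            % p-by-(n-1)
size_score = size(score);            % n-by-(n-1)

% Requesting the full output gives the p-by-p coeff described in the
% documentation; the additional components have zero variance.
[coeffFull, ~, latentFull] = pca(X, 'Economy', false);
size_coeffFull = size(coeffFull);    % p-by-p
```

For the real 4999*37152 matrix, the full p*p coeff would hold 37152^2 doubles, roughly 11 GB, while adding no information, so the default economy-size output is probably what you want.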
For more, see the Statistics site:
The MATLAB documentation is written under the assumption that you have at least as many observations as variables, which is how people normally use PCA.
Of course, it's possible that your data actually has 37152 observations for 4999 variables, in which case you need to transpose X.
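If the transposed layout is the right one, the fix is a one-character change. Here is a minimal sketch, again on a small random stand-in (5 variables stored in rows, 20 observations stored in columns, both hypothetical sizes) rather than your actual data:

```matlab
% Hypothetical layout: each ROW is a variable, each COLUMN an observation.
rng(0);                              % fixed seed so the run is repeatable
X = randn(5, 20);

% pca expects observations in rows, so pass the (non-conjugate) transpose.
[coeff, score, latent] = pca(X.');

% With more observations than variables, coeff is square (p-by-p).
size_coeff = size(coeff);            % 5-by-5
size_score = size(score);            % 20-by-5
```

Applied to your matrix, pca(X.') would treat the 37152 columns as observations and return a 4999*4999 coeff, matching the p*p shape from the documentation.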