My question is quite elementary but I need help understanding the basic concepts.
In the following example from the MathWorks documentation page of the princomp function:
load hald;
[pc,score,latent,tsquare] = princomp(ingredients);
pc,latent
we get the following values:
pc =
-0.0678 -0.6460 0.5673 0.5062
-0.6785 -0.0200 -0.5440 0.4933
0.0290 0.7553 0.4036 0.5156
0.7309 -0.1085 -0.4684 0.4844
latent =
517.7969
67.4964
12.4054
0.2372
score =
36.8218 -6.8709 -4.5909 0.3967
29.6073 4.6109 -2.2476 -0.3958
-12.9818 -4.2049 0.9022 -1.1261
23.7147 -6.6341 1.8547 -0.3786
-0.5532 -4.4617 -6.0874 0.1424
-10.8125 -3.6466 0.9130 -0.1350
-32.5882 8.9798 -1.6063 0.0818
22.6064 10.7259 3.2365 0.3243
-9.2626 8.9854 -0.0169 -0.5437
-3.2840 -14.1573 7.0465 0.3405
9.2200 12.3861 3.4283 0.4352
-25.5849 -2.7817 -0.3867 0.4468
-26.9032 -2.9310 -2.4455 0.4116
Legend:
latent is a vector containing the eigenvalues of the covariance matrix of X.
pc is a p-by-p matrix, each column containing coefficients for one principal component. The columns are in order of decreasing component variance.
score is the principal component scores; that is, the representation of X in the principal component space. Rows of SCORE correspond to observations, columns to components.
Can somebody explain whether the values of score are generated from the values of pc and, if so, what kind of computation is performed?
Yes, it holds that score = norm_ingredients * pc, where norm_ingredients is the centered version of your input matrix, obtained by subtracting each column's mean so that every column has zero mean, that is,
norm_ingredients = ingredients - repmat(mean(ingredients), size(ingredients, 1), 1)
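If you want to convince yourself of this, here is a minimal sketch that recomputes the scores and compares them with what princomp returns. It assumes the Statistics Toolbox is installed so that hald and princomp are available. The names centered, score_check, max_diff, var_diff and eig_diff are just illustrative. On newer releases where princomp is no longer available, pca(ingredients) should be usable in its place; the relation between score and the coefficient matrix is the same, although the signs of individual columns may differ.

```matlab
% Sketch: check that score is the centered data projected onto pc.
load hald;                                    % provides the 13-by-4 matrix 'ingredients'
[pc, score, latent] = princomp(ingredients);  % assumption: on newer releases use pca(ingredients)

% Center each column by subtracting its mean (equivalent to the repmat expression)
centered = bsxfun(@minus, ingredients, mean(ingredients, 1));

% Project the centered data onto the principal component directions
score_check = centered * pc;

% Largest absolute difference between the two; should be at round-off level
max_diff = max(abs(score(:) - score_check(:)))

% The variance of each score column should match the corresponding entry of latent
var_diff = max(abs(var(score)' - latent))

% latent should also match the sorted eigenvalues of the covariance matrix
eig_diff = max(abs(sort(eig(cov(ingredients)), 'descend') - latent))
```

The same holds row by row: the scores of the first observation are (ingredients(1,:) - mean(ingredients)) * pc.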