I am using Singular Value Decomposition (SVD) for Principal Component Analysis (PCA) of images.
I have 17 images of size 20 x 20, so I created an image matrix
M = dim(400 x 17)
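For reference, I build M roughly like this (a sketch; imgs stands in for my actual face images):

import numpy as np

imgs = [np.random.rand(20, 20) for _ in range(17)]  # stand-ins for my 17 images
# flatten each 20 x 20 image into a 400-vector and stack them as the columns of M
M = np.stack([img.ravel() for img in imgs], axis=1)  # shape (400, 17)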
and when I apply SVD (M = u @ d @ v) it gives me

u = dim(400 x 17)
d = dim(17 x 17)
v = dim(17 x 17)

but I want to find

u = dim(400 x 400)
d = dim(400 x 400)
v = dim(400 x 17)

because there should be 400 eigenvectors and 400 eigenvalues.
I even tried taking the transpose, but no luck.
I know the question's title may not be very clear, so please change it as you like. Here is some info related to the data:
I centered the data by subtracting the mean face.
I tried to work around this by finding the eigenvectors of the covariance matrix (MM'), but when I tried to display PC1, it only shows a black image.
Please help me out
Eigenvalues aren't defined for rectangular matrices, but the singular values are closely related. As for eigenvectors, you always have sets of left and right singular vectors that span the column and row spaces.
SVD is related to the eigenvalue decompositions of MM' and M'M:

M'M = V (S'S) V'
MM' = U (SS') U'
Now, V contains the eigenvectors of M'M, which is of size (17 x 17) in your case; therefore V is (17 x 17). U contains the eigenvectors of MM', which is of size (400 x 400) in your case; therefore U is (400 x 400).
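As a quick numerical sanity check of this (a sketch; the random M stands in for your centered image matrix):

import numpy as np

M = np.random.rand(400, 17)

evals_small, V = np.linalg.eigh(M.T @ M)  # eigenvectors of M'M
evals_big, U = np.linalg.eigh(M @ M.T)    # eigenvectors of MM'

print(V.shape, U.shape)  # (17, 17) (400, 400)

# MM' has the same 17 non-zero eigenvalues as M'M, plus 383 (near-)zero ones;
# np.linalg.eigh returns eigenvalues in ascending order
print(np.allclose(evals_small, evals_big[-17:]))  # True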
Now what's the size of S? The non-zero elements of S (the singular values) are the square roots of the non-zero eigenvalues of M'M and MM'. It can be shown that these two matrices have the same set of non-zero eigenvalues; therefore in the first case S is (17 x 17) and in the second case (400 x 400). How do we reconcile this with the fact that our SVD is M = USV'? We build a rectangular diagonal matrix of size (400 x 17) with the square roots of the 17 non-zero eigenvalues on its diagonal.
You can use the SVD from scipy:

import scipy.linalg

u, s, vh = scipy.linalg.svd(M, full_matrices=True)
print(u.shape, s.shape, vh.shape)
which gives

(400, 400) (17,) (17, 17)
To get your S to shape (400 x 17):

import numpy as np

# stack the (17 x 17) diagonal of singular values on top of 383 rows of zeros
s = np.concatenate([np.diag(s), np.zeros((400 - 17, 17))], axis=0)
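Note that scipy also provides a helper that builds this rectangular diagonal matrix directly, given the 1-D array of singular values returned by svd (use it instead of the manual concatenate above):

s = scipy.linalg.diagsvd(s, 400, 17)  # same (400, 17) rectangular diagonal matrix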
Check that the SVD reconstructs M:

res = u @ s @ vh
print(np.allclose(res, M))  # True
Sometimes you want to approximate your matrix M with a low-rank M_tilde of rank r. In this case, if you want to minimize the Frobenius norm of the difference between the two, you just keep the r largest singular values (Eckart-Young theorem). The sizes of U, S, V then become (400 x r), (r x r), (r x 17), where S is diagonal.
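A minimal sketch of that truncation, reusing u, s, vh from the scipy call above (with s the 1-D array of singular values, which comes back sorted in descending order) and a hypothetical rank r:

r = 5                        # example rank
u_r = u[:, :r]               # (400, r)
s_r = np.diag(s[:r])         # (r, r), the r largest singular values
vh_r = vh[:r, :]             # (r, 17)
M_tilde = u_r @ s_r @ vh_r   # best rank-r approximation in Frobenius norm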
I don't know which function you're using, but this is what's happening: the zero singular values are being thrown away, since an (m x n) matrix can have rank at most min(m, n) (17 in your case).