math, matlab, svd, dimension-reduction

Dimension Reduction


I'm trying to reduce a high-dimensional dataset to 2-D. However, I don't have access to the whole dataset up front, so I'd like to generate a function that takes an N-dimensional vector and returns a 2-dimensional vector, such that if I give it two vectors that are close in N-dimensional space, the results are close in 2-dimensional space.

I thought SVD was the answer I needed, but I can't make it work.

For simplicity, let N=3 and suppose I have 15 data points. If I have all the data up front in a 15x3 matrix X, then:

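[U, S, V] = svd(X);
s = S;               % s is the reduced version of S (MATLAB is case-sensitive, so s and S are distinct)
s(3:end,3:end) = 0;  % zero out every singular value after the second
Y = U*s;             % 15x3; the third column is now all zeros
Y = Y(:,1:2);        % keep the first two columns: one 2-D row per data point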

does what I want. But suppose I get a new datapoint, A, a 1x3 vector. Is there a way to use U, S, or V to turn A into the appropriate 1x2 vector?

If SVD is a lost cause, can someone tell me what I should be doing instead?

Note: This is Matlab code, but I don't care if the answer is C, Java, or just math. If you can't read Matlab, ask and I'll clarify.


Solution

  • SVD is (probably) a fine approach. LSA (Latent Semantic Analysis) is built on it and uses basically the same dimensionality-reduction approach. I've talked about that (at length) at: lsa-latent-semantic-analysis-how-to-code-it-in-php or check out the LSA tag here on SO.

    I realize it's an incomplete answer. Holler if you want more help!
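
To fill in the step the question asks about, here is a minimal sketch rather than a definitive implementation. It relies on the fact that X = U*S*V' with V orthogonal, so U*S equals X*V; the reduced data is therefore X times the first two columns of V, and a new row vector can be multiplied by that same 3x2 matrix. The example data and the names k, Vk, and a2 are made up for illustration and do not come from the question.

```matlab
% Hypothetical example data: 15 points in 3-D (any 15x3 matrix would do).
X = [(1:15)', ((1:15)').^2, sin((1:15)')];

% Economy-size SVD: X = U*S*V', where V is a 3x3 orthogonal matrix.
[U, S, V] = svd(X, 'econ');

% Keep the right singular vectors for the two largest singular values.
k  = 2;
Vk = V(:, 1:k);        % 3x2 projection matrix

% Reduced data. Because X*V = U*S, this equals U(:,1:k)*S(1:k,1:k).
Y = X * Vk;            % 15x2

% A new 1x3 point is projected with the same matrix; the SVD is not redone.
A  = [16, 256, sin(16)];
a2 = A * Vk;           % 1x2
```

Since the columns of Vk are orthonormal, this mapping never increases the distance between two points, so vectors that are close in 3-D stay close in 2-D. If the data is not centered around the origin, subtracting the column means of X from both X and A before projecting would turn this into ordinary PCA; whether that is wanted depends on the data.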