Search code examples
pythonoptimizationscipyscikit-learnmatrix-factorization

Computing Low-Rank approximation in Python


Given a matrix M of dimensions nxn, how can I compute a low rank factorization such that M = L.T * L, where L is of dimensions kxn. So far I've only seen this done using SVD, which isn't exactly what I want because the method gives me M = USV, and U.T != S*V, as opposed to (L.T).T == L.

Another alternative could be to use some form of optimization to find L, however it isn't straightforward because I've already tried several optimization methods from SciPy with the difference M - L.T * L under the frobenius norm, and so far I haven't been successful.

Edit: I forgot to add that by using scikit's Non-Negative Matrix Factorization class I'm able to achieve this partially by passing L and L.T as candidate matrices for the optimization. However, my matrix M is not non-negative, therefore this method doesn't work for me.


Solution

  • The answer depends on what you know about the matrix.

    If the matrix is positive semidefinite, you could use Cholesky Factorization, use pivoting for stability.

    Under other assumptions, a solution may not exist.


    An example where a solution may not exist, there is no solution for the following matrix:

    [[0, 1],
     [0, 0]] 
    

    Proof: Assume the answer exists. Then the solution looks like:

    L = [[a, b],
         [c, d]]
    

    So the following must be True:

    1. a*a + b*c == 0
    2. d*d + b*c == 0
    3. c * (a+d) == 0
    4. b * (a+d) == 1

    According to 3. (c == 0) or ((a+d) == 0)

    If c == 0, then according to 1. and 2. a == 0 and d == 0. If this is true, then (a+d) == 0 which makes 4. impossible.

    If (a+d) == 0 then 4. is impossible.

    By contradiction we know that there cannot be a decomposition you ask for with this matrix.