Search code examples
rmatlabmatrixlinear-algebraumfpack

LU decomposition on a sparse rectangular matrix in R


MATLAB is able to perform LU decomposition on a sparse rectangular matrix using [L, U, P, Q] = lu(A) but there is no R package to do so yet. Trying to use Matrix::lu() on a sparse matrix in R returns an error complaining that the matrix must be square.

Is there any analogue of MATLAB's LU decomposition on a rectangular matrix that is not full rank in R? To clarify, this is not about memory issues - if I embed it in an identity matrix (see this question) then Matrix::lu() complains about the matrix being singular, while MATLAB's lu() happily proceeds.

The problem I'm trying to solve is to implement a particular econometric estimator in R that needs the Moore-Penrose pseudo inverse. pracma's Moore-Penrose pseudo inverse fails because the matrix is too large. Getting a Moore-Penrose pseudo inverse of a large sparse matrix is also unsolved apparently, see here. My matrix has more than one 1 in any given row and column.


Solution

  • Solved. The following code uses UMFPACK, which is the same library MATLAB uses. To get the right elements, we need to add Control[UMFPACK_SCALE] = 0;. Then we need to fix the size of the resulting U matrix in R.

    #include <suitesparse/umfpack.h>
    #include <Rcpp.h>
    #include <algorithm>
    #include <vector>
    #include <iostream>
    
    // [[Rcpp::export]]
    Rcpp::List sparseLU(const std::vector<int> &Ap, const std::vector<int> &Ai, const std::vector<double> &Ax) {
      int n = Ap.size() - 1;
      int m = *std::max_element(Ai.begin(), Ai.end()) + 1;
      int nnz = Ax.size();
      double Control [UMFPACK_CONTROL], Info [UMFPACK_INFO];
      umfpack_di_defaults(Control);
      Control[UMFPACK_PRL] = 2;
      Control[UMFPACK_SCALE] = 0;
      Control[UMFPACK_STRATEGY] = UMFPACK_STRATEGY_SYMMETRIC;
    
      void *symbolic, *numeric;
      int status;
      status = umfpack_di_symbolic(m, n, Ap.data(), Ai.data(), Ax.data(), &symbolic, Control, Info);
      if (status != UMFPACK_OK) {
        umfpack_di_report_status(Control, status);
        if (status < 0) {
          Rcpp::stop("umfpack_di_symbolic failed with status %d", status);
        }
      }
    
      status = umfpack_di_numeric(Ap.data(), Ai.data(), Ax.data(), symbolic, &numeric, Control, Info);
      if (status != UMFPACK_OK) {
        umfpack_di_report_status(Control, status);
        if (status < 0) {
          Rcpp::stop("umfpack_di_numeric failed with status %d", status);
          umfpack_di_free_symbolic(&symbolic);
        }
      }
    
      umfpack_di_free_symbolic(&symbolic);
    
      int lnz, unz, nz_udiag;
      status = umfpack_di_get_lunz(&lnz, &unz, &m, &n, &nz_udiag, numeric);
      if (status != UMFPACK_OK) {
        umfpack_di_report_status(Control, status);
        if (status < 0) {
          Rcpp::stop("umfpack_di_get_lunz failed with status %d", status);
          umfpack_di_free_numeric(&numeric);
        }
      }
    
    
      std::vector<int> Lp(m+1), Lj(lnz), Up(n+1), Ui(unz), P(m), Q(n);
      std::vector<double> Lx(lnz), Ux(unz), D(std::min(m, n)), Rs(m);
      int do_recip;
    
      status = umfpack_di_get_numeric(Lp.data(), Lj.data(), Lx.data(), Up.data(), Ui.data(), Ux.data(), P.data(), Q.data(), D.data(), &do_recip, Rs.data(), numeric);
      if (status != UMFPACK_OK) {
        umfpack_di_report_status(Control, status);
        if (status < 0) {
          umfpack_di_free_numeric(&numeric);
          Rcpp::stop("umfpack_di_get_numeric failed with status %d", status);
        }
      }
    
      umfpack_di_free_numeric(&numeric);
    
      Rcpp::List ret = Rcpp::List::create(
          Rcpp::Named("L") = Rcpp::List::create(Rcpp::Named("j") = Lj, Rcpp::Named("p") = Lp, Rcpp::Named("x") = Lx),
          Rcpp::Named("U") = Rcpp::List::create(Rcpp::Named("i") = Ui, Rcpp::Named("p") = Up, Rcpp::Named("x") = Ux),
          Rcpp::Named("P") = P,
          Rcpp::Named("Q") = Q
          );
    
    
      return ret;
    }
    

    Then we can load it in R like this

    # A is a sparseMatrix() from package Matrix
    LUPQ <- sparseLU(A@p, A@i, A@x) 
    n <- ncol(A)
    L <- sparseMatrix(j = LUPQ$L$j, p = LUPQ$L$p, x = LUPQ$L$x, index1 = FALSE) 
    U <- sparseMatrix(i = LUPQ$U$i, p = LUPQ$U$p, x = LUPQ$U$x, index1 = FALSE, dims = c(n, n))
    Q <- sparseMatrix(i = LUPQ$Q + 1, j = 1:length(LUPQ$Q), x = 1)