Search code examples
clibrarieslinear-algebrasparse-matrixsuitesparse

Multiply by supernodal L in CHOLMOD?


How can I multiply by the cholmod_factor L in a supernodal L L^T factorisation? I'd prefer not to convert to simplicial since the supernodal representation results in faster backsolves, and I'd prefer not to make a copy of the factor since two copies might not fit in RAM.


Solution

  • I wound up understanding the supernodal representation from a nice comment in the supernodal-to-simplicial helper function in t_cholmod_change_factor.c. I paraphrase the comment and add some details below:

    A supernodal Cholesky factorisation is represented as a collection of supernodal blocks. The entries of a supernodal block are arranged in column-major order like this 6x4 supernode:

    t - - -    (row s[pi[snode+0]])
    t t - -    (row s[pi[snode+1]])
    t t t -    (row s[pi[snode+2]])
    t t t t    (row s[pi[snode+3]])
    r r r r    (row s[pi[snode+4]])
    r r r r    (row s[pi[snode+5]])
    
    • There are unused entries (indicated by the hyphens) in order to make the matrix rectangular.
    • The column indices are consecutive.
    • The first ncols row indices are those same consecutive column indices. Later row indices can refer to any row below the t triangle.
    • The super member has one entry for each supernode; it refers to the first column represented by the supernode.
    • The pi member has one entry for each supernode; it refers to the first index in the s member where you can look up the row numbers.
    • The px member has one entry for each supernode; it refers to the first index in the x member where the entries are stored. Again, this is not packed storage.

    The following code for multiplication by a cholmod_factor *L appears to work (I only care about int indices and double-precision real entries):

    cholmod_dense *mul_L(cholmod_factor *L, cholmod_dense *d) {
      int rows = d->nrow, cols = d->ncol;
      cholmod_dense *ans = cholmod_allocate_dense(rows, cols, rows,
          CHOLMOD_REAL, &comm);
      memset(ans->x, 0, 8 * rows * cols);
    
      FOR(i, L->nsuper) {
        int *sup = (int *)L->super;
        int *pi = (int *)L->pi;
        int *px = (int *)L->px;
        double *x = (double *)L->x;
        int *ss = (int *)L->s;
    
        int r0 =  pi[i], r1 =  pi[i+1], nrow = r1 - r0;
        int c0 = sup[i], c1 = sup[i+1], ncol = c1 - c0;
        int px0 = px[i];
    
        /* TODO: Use BLAS instead. */
        for (int j = 0; j < ncol; j++) {
          for (int k = j; k < nrow; k++) {
            for (int l = 0; l < cols; l++) {
              ((double *)ans->x)[l * rows + ss[r0 + k]] +=
                  x[px0 + k + j * nrow] * ((double *)d->x)[l*rows+c0 + j];
            }
          }
        }
      }
      return ans;
    }