
R constructing sparse Matrix


I'm reading through the documentation for the Matrix package in R, but I can't understand the p argument of the function:

sparseMatrix(i = ep, j = ep, p, x, dims, dimnames,
         symmetric = FALSE, index1 = TRUE,
         giveCsparse = TRUE, check = TRUE)

According to http://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/sparseMatrix.html

p:
numeric (integer valued) vector of pointers, one for each column (or row), to the initial (zero-based) index of elements in the column (or row). Exactly one of i, j or p must be missing.

I gather that p is a compressed representation of either the row or the column indices, since it is wasteful to repeat the same value many times in i or j just to mark a single row or column. But even after trying the example provided, I still can't figure out how p controls which element of x goes to which row or column:

dn <- list(LETTERS[1:3], letters[1:5])
## pointer vectors can be used, and the (i,x) slots are sorted if necessary:
m <- sparseMatrix(i = c(3,1, 3:2, 2:1), p= c(0:2, 4,4,6), x = 1:6, dimnames = dn)

Solution

  • Just read a bit further down in ?sparseMatrix to learn how p is interpreted. (In particular, note the part about the "expanded form" of p.)

    If ‘i’ or ‘j’ is missing then ‘p’ must be a non-decreasing integer vector whose first element is zero. It provides the compressed, or “pointer” representation of the row or column indices, whichever is missing. The expanded form of ‘p’, ‘rep(seq_along(dp),dp)’ where ‘dp <- diff(p)’, is used as the (1-based) row or column indices.

    Here is a little function that will help you see what that means in practice:

    pex <- function(p) {
        dp <- diff(p)
        rep(seq_along(dp), dp)
    }
    
    ## Play around with the function to discover the indices encoded by p.
    pex(p = c(0,1,2,3))
    # [1] 1 2 3
    
    pex(p = c(0,0,1,2,3))
    # [1] 2 3 4
    
    pex(p = c(10,11,12,13))
    # [1] 1 2 3
    
    pex(p = c(0,0,2,5))
    # [1] 2 2 3 3 3
    
    pex(p = c(0,1,3,3,3,3,8))
    # [1] 1 2 2 6 6 6 6 6
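
To tie this back to the example in the question, here is a minimal sketch that expands the question's p into explicit column indices and then places each x value by hand. It redefines pex so that it runs on its own and uses only base R for the placement; the names j and dense are introduced here for illustration and are not part of the Matrix API. The final check against sparseMatrix() only runs if the Matrix package is installed.

```r
# Same helper as above, repeated so this snippet is self-contained.
pex <- function(p) {
    dp <- diff(p)
    rep(seq_along(dp), dp)
}

# Arguments from the example in the question.
i  <- c(3, 1, 3:2, 2:1)      # row index of each element of x
p  <- c(0:2, 4, 4, 6)        # column pointers (j is the missing argument)
x  <- 1:6
dn <- list(LETTERS[1:3], letters[1:5])

# Expanded form of p: the (1-based) column index of each element of x.
j <- pex(p)
j
# [1] 1 2 3 3 5 5

# Place x[k] at (i[k], j[k]) in an ordinary dense matrix (illustration only).
dense <- matrix(0L, nrow = 3, ncol = length(p) - 1, dimnames = dn)
dense[cbind(i, j)] <- x
dense
#   a b c d e
# A 0 2 0 0 6
# B 0 0 4 0 5
# C 1 0 3 0 0

# Optional: compare with what sparseMatrix() builds from the same i, p, x.
if (requireNamespace("Matrix", quietly = TRUE)) {
    m <- Matrix::sparseMatrix(i = i, p = p, x = x, dimnames = dn)
    stopifnot(all(as.matrix(m) == dense))
}
```

Reading diff(p) as "number of stored entries per column" gives 1, 1, 2, 0, 2: column d is empty because p repeats the value 4, and the 4th and 5th elements of x both land in column c.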