Efficiently Load A Sparse Matrix in R


I'm having trouble efficiently loading data into a sparse matrix format in R.

Here is an (incomplete) example of my current strategy:

library(Matrix)
a1 <- Matrix(0, 5000, 100000, sparse = TRUE)
for (i in 1:5000) {
  # idxOfCols and x are defined elsewhere (omitted from this example)
  a1[i, idxOfCols] <- x
}

Here x usually has length around 20. This is not efficient and eventually slows to a crawl. I know there is a better way, but I'm not sure what it is. Suggestions?


Solution

  • You can populate the matrix all at once by passing the row indices, column indices, and values to sparseMatrix():

    library(Matrix)
    n <- 5000
    m <- 1e5
    k <- 20
    idxOfCols <- sample(1:m, k)
    x <- rnorm(k)
    
    a2 <- sparseMatrix(
      i=rep(1:n, each=k),
      j=rep(idxOfCols, n),
      x=rep(x, n),  # one copy of x per row, same length as i and j
      dims=c(n,m)
    )
    
    # Compare with the row-by-row loop (slow, shown only as a check)
    a1 <- Matrix(0, n, m, sparse=TRUE)
    for(i in 1:n) {
      a1[i,idxOfCols] <- x
    }
    sum(a1 - a2) # 0
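
The example above reuses the same idxOfCols and x for every row. If each row has its own column indices and values, as the question seems to imply, the same idea should still apply: collect the indices and values first, then call sparseMatrix() once. The following is a minimal sketch under that assumption. The lists cols and vals are hypothetical stand-ins for the real per-row data, filled here with random values only for illustration.

```r
library(Matrix)

set.seed(1)
n <- 5000   # rows
m <- 1e5    # columns
k <- 20     # nonzeros per row (rows may have different lengths in practice)

# Hypothetical per-row data: cols[[i]] holds the column indices of row i,
# vals[[i]] holds the matching values. Replace these with the real data.
cols <- lapply(seq_len(n), function(i) sample.int(m, k))
vals <- lapply(seq_len(n), function(i) rnorm(k))

# Flatten everything into (i, j, x) triplets and build the matrix in one call.
# lengths(cols) gives the number of entries in each row, so rows of
# different lengths are handled as well.
a3 <- sparseMatrix(
  i = rep(seq_len(n), lengths(cols)),
  j = unlist(cols),
  x = unlist(vals),
  dims = c(n, m)
)

# Spot check: row 1 holds the values it was given, in the same order
all.equal(a3[1, cols[[1]]], vals[[1]])
```

Note that sparseMatrix() adds up the values of duplicated (i, j) pairs by default, so the column indices within a row should be unique unless summing is what you want.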