Tags: matrix, bigdata, rcpp, memory-efficient

Efficient way of transforming a ddiMatrix to a big.matrix in R


I've been trying, without success, to transform a ddiMatrix (diagonal matrix) into a big.matrix in R in a memory-efficient way, since my matrix is large.

The only way I managed to do it was to convert it to a regular dense matrix first, and then to a big.matrix, as in the example below.

library(Matrix)
library(bigmemory)

DM <- Diagonal(x = c(1,2,3,4))       # sparse ddiMatrix
DM <- as.big.matrix(as.matrix(DM))   # dense intermediate, then big.matrix

This is very memory-intensive (and therefore slow), as my benchmark metrics show.
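
For example, one way to see the cost with base R's object.size() and system.time() (a rough illustration of the kind of measurement meant here; exact numbers will vary by machine):

# Rough illustration: as.big.matrix(as.matrix(.)) first materialises
# an n x n dense matrix inside R before copying it into the big.matrix.
library(Matrix)
library(bigmemory)

n  <- 2000
DM <- Diagonal(x = as.numeric(1:n))

object.size(DM)                                   # small: only the diagonal is stored
system.time(BM <- as.big.matrix(as.matrix(DM)))   # pays for two n x n allocations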

Any idea? (Using Rcpp or anything else)


Solution

  • I fear this may be unavoidable as you are converting from sparse to dense matrices.

    Consider your example but with a slightly larger initial diagonal matrix, and an explicit intermediate step:

    > sparseDM <- Matrix::Diagonal(x = as.numeric(1:100))
    > denseM <- Matrix::as.matrix(sparseDM)
    > bigM <- bigmemory::as.big.matrix(denseM)
    > 
    > dang::ls.objects()    # simple helper function in a package of mine
                   Type  Size Rows Columns
    bigM     big.matrix   696  100     100
    denseM       matrix 80216  100     100
    sparseDM  ddiMatrix  2040  100     100
    > 
    

    As you can see, the sparse matrix is much smaller: it stores just 100 elements (plus some internal indexing), whereas the 100 x 100 dense matrix takes approximately 100^2 * 8 bytes, plus a bit of SEXP overhead. The bigmemory object then looks small to R because its allocation lives outside of R, but as a dense matrix it still stores all the off-diagonal elements you don't really care about.
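
    If you don't have that helper package, a rough equivalent with plain object.size() (byte counts are approximate and platform-dependent):

    # Same comparison using base R's object.size(); exact sizes will
    # differ a little from the listing above.
    library(Matrix)
    library(bigmemory)

    sparseDM <- Diagonal(x = as.numeric(1:100))
    denseM   <- as.matrix(sparseDM)
    bigM     <- as.big.matrix(denseM)

    sapply(list(sparseDM = sparseDM, denseM = denseM, bigM = bigM),
           function(x) as.numeric(object.size(x)))
    # bigM looks tiny because object.size() only sees the R-side
    # external pointer; the 100 x 100 dense allocation lives outside R.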

    In short, it sounds like you want a 'bigmemory-alike sparse matrix' class. As far as I know, nobody has written one.
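
    One partial workaround, if a dense big.matrix is an acceptable end result, is to allocate it directly and write only the diagonal, which at least avoids the dense in-R intermediate produced by as.matrix(). A minimal sketch (diag_to_big is a hypothetical helper, not part of any package; the big.matrix itself is of course still dense, and the R-level loop could be moved to C++ via bigmemory's MatrixAccessor if it becomes a bottleneck):

    library(Matrix)
    library(bigmemory)

    # Hypothetical helper: copy only the diagonal of a ddiMatrix into a
    # freshly allocated (dense, zero-initialised) big.matrix.
    diag_to_big <- function(D) {
      d  <- diag(D)                 # diagonal entries as a plain numeric vector
      n  <- length(d)
      bm <- big.matrix(nrow = n, ncol = n, type = "double", init = 0)
      for (i in seq_len(n)) bm[i, i] <- d[i]
      bm
    }

    DM <- Diagonal(x = c(1, 2, 3, 4))
    BM <- diag_to_big(DM)
    BM[, ]                          # back to a base matrix, just to check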