
Merge two dgCMatrix sparse matrices of different size in R


How can I merge two large sparse matrices (around 500k rows and columns) of formal class dgCMatrix that differ in size (in both rows and columns) in R?

Simplified example: I have a full 6x6 matrix

  1 2 3 4 5 6
1 0 0 0 0 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 0
4 0 0 0 0 0 0
5 0 0 0 0 0 0
6 0 0 0 0 0 0

Now I want to merge a second matrix of different size:

  3 4 5 6
1 0 1 0 0
3 0 0 1 0
4 1 0 0 0

The result should be:

  1 2 3 4 5 6
1 0 0 0 1 0 0
2 0 0 0 0 0 0
3 0 0 0 0 1 0
4 0 0 1 0 0 0
5 0 0 0 0 0 0
6 0 0 0 0 0 0

I tried cbindX and merge, but neither worked. They failed with either:

only matrices and data.frames can be used

or

cannot coerce class "structure("dgCMatrix", package = "Matrix")" to a data.frame.

However, I cannot convert my matrix to a dense (sparse = FALSE) matrix, as suggested in another post, or to a data.frame, because R can no longer handle a matrix of that size on my machine.

Any help would be highly appreciated. Thanks!


Solution

  • One strategy is to rebuild the smaller matrix with the same dimensions as the larger one and then add the two.

    library(Matrix)
    A <- sparseMatrix(8, 8, x = 1)                        # 8 x 8, single entry at (8, 8)
    B <- sparseMatrix(c(1,3,5), c(3,6,3), x = c(1,4,1))   # 5 x 6, three non-zero entries
    

    summary(B) returns the non-zero entries of B as (i, j, x) triplets. Pass these to sparseMatrix(i, j, x, dims) with dims = dim(A) to recreate B at the size of A. This works because B is no larger than A in either dimension:

    > summary(B)
    5 x 6 sparse Matrix of class "dgCMatrix", with 3 entries 
      i j x
    1 1 3 1
    2 5 3 1
    3 3 6 4
    
    s <- summary(B)   # compute the triplets once instead of three times
    B <- sparseMatrix(i = s$i, j = s$j, x = s$x, dims = dim(A))
    

    Then you can just add the matrices:

    A <- A + B
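
The example above lines the matrices up by position. In the question the second matrix is identified by row and column labels, so its indices would need to be translated first. A minimal sketch of that, assuming the labels are stored as character dimnames and that every label of the smaller matrix also exists in the larger one (the names `trip`, `B_big` and `merged` are only illustrative):

```r
library(Matrix)

# Target: empty 6 x 6 sparse matrix with row/column labels "1".."6".
A <- sparseMatrix(i = integer(0), j = integer(0), x = numeric(0),
                  dims = c(6, 6),
                  dimnames = list(as.character(1:6), as.character(1:6)))

# Smaller matrix from the question: rows "1", "3", "4"; columns "3".."6".
B <- sparseMatrix(i = c(1, 2, 3), j = c(2, 3, 1), x = c(1, 1, 1),
                  dims = c(3, 4),
                  dimnames = list(c("1", "3", "4"), c("3", "4", "5", "6")))

# Triplet (i, j, x) view of B's non-zero entries.
trip <- summary(B)

# Translate B's positions into A's positions by matching labels.
# Assumption: every label in B is present in A (otherwise match() gives NA
# and sparseMatrix() stops with an error).
B_big <- sparseMatrix(i = match(rownames(B)[trip$i], rownames(A)),
                      j = match(colnames(B)[trip$j], colnames(A)),
                      x = trip$x,
                      dims = dim(A), dimnames = dimnames(A))

# Both operands now have identical dimensions, so they can be added.
merged <- A + B_big
```

Here `merged` holds ones at the cells labelled (1, 4), (3, 5) and (4, 3), following the labels of the second matrix. Nothing is converted to a dense matrix along the way, so the approach should stay usable at the sizes mentioned in the question.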