I want to get the row-column coordinates for all nonzero elements in a matrix M. If M isn't too big, it's straightforward:
m <- matrix(sample(0:1, 25, TRUE, prob=c(0.75, 0.25)), 5, 5)
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    0    0    0    0    0
# [2,]    1    1    0    0    0
# [3,]    0    0    0    1    0
# [4,]    0    0    1    0    0
# [5,]    0    0    0    0    0
nz <- which(m != 0)
cbind(row(m)[nz], col(m)[nz])
#      [,1] [,2]
# [1,]    2    1
# [2,]    2    2
# [3,]    4    3
# [4,]    3    4
However, in my case M is a sparse matrix (created using the Matrix package), whose dimensions can be very large. If I call row(M) and col(M) like above, I'll be generating a couple of dense matrices the same size as M, which I definitely don't want to do.
Is there a way of getting a result like the above without creating dense matrices along the way?
I think you want
which(m!=0,arr.ind=TRUE)
Looking at showMethods("which"), it seems that this is set up to work efficiently with sparse matrices. You can also get the answer more directly (but inscrutably) for a sparse, column-oriented matrix by manipulating the internal @p (column pointer) and @i (0-based row index) slots:
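library(Matrix)
mm <- Matrix(m, sparse = TRUE)  # column-compressed sparse matrix
dp <- diff(mm@p)  # number of stored entries in each column
cbind(mm@i + 1, rep(seq_along(dp), dp))  # @i is 0-based, so add 1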
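As a rough check that the two approaches agree, here is a minimal sketch on a larger matrix. The seed, dimensions, and density are arbitrary values picked for illustration, and rsparsematrix() is used only to get a random column-compressed test matrix. The comparison goes through linear indices so it doesn't rely on either method returning rows in a particular order.

```r
library(Matrix)

set.seed(1)  # arbitrary seed, only so the example is reproducible
# Illustrative test matrix: 1000 x 800 with roughly 0.1% stored entries
M <- rsparsematrix(1000, 800, density = 0.001)

# Approach 1: which() with arr.ind = TRUE on the sparse logical matrix M != 0
idx_which <- which(M != 0, arr.ind = TRUE)

# Approach 2: decode the compressed-column slots directly
dp <- diff(M@p)                                  # stored entries per column
idx_slots <- cbind(row = M@i + 1L,               # @i holds 0-based row indices
                   col = rep(seq_along(dp), dp)) # repeat each column number

# Convert (row, col) pairs to linear indices for an order-independent comparison
lin <- function(ij) ij[, 1] + (ij[, 2] - 1) * nrow(M)
stopifnot(nrow(idx_which) == nrow(idx_slots))
stopifnot(all(sort(lin(idx_which)) == sort(lin(idx_slots))))
```

One caveat: the slot-based version lists every stored entry, so if the matrix happens to carry explicitly stored zeros it will report those positions too, whereas which(M != 0, arr.ind = TRUE) tests the values themselves.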