I need to compute M = np.log(1+M)
on a large 2D matrix that is 99.9% sparse.
How can I do this efficiently?
x, y = M.nonzero()
returns the coordinate pairs of the nonzero elements, but can I vectorize a log
operation over these pairs?
NumPy doesn't seem to have sparse matrix
support.
This is simplest:
import numpy as np
import scipy.sparse as sps
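M = sps.csr_matrix(M)    # convert to CSR; only the nonzero entries are stored
M.data += 1              # M.data holds just the stored (nonzero) values
M.data = np.log(M.data)  # implicit zeros stay zero because log(1 + 0) == 0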
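If the matrix is particularly large, you can also take the log in place, which avoids allocating the temporary array that np.log creates above (M.data must have a floating-point dtype for this to work):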
M.data += 1
np.log(M.data, out=M.data)  # writes the result into the existing buffer
Both of these options also work on dense arrays with minor changes (operate on M directly instead of M.data). If your matrix is 99.9% sparse, though, I would start using actual sparse data structures.
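As a rough end-to-end sketch of the sparse approach, the snippet below builds a small test matrix and checks the result against the dense computation. The helper name sparse_log1p_inplace and the test data are made up for illustration, and it swaps the separate add and log steps for np.log1p, which computes log(1 + x) in one call; it assumes M.data is floating point. SciPy's sparse matrices also provide a log1p() method that returns a new matrix, shown here for comparison.

```python
import numpy as np
import scipy.sparse as sps


def sparse_log1p_inplace(M):
    """Apply log(1 + x) to the stored entries of a CSR/CSC matrix, in place.

    Hypothetical helper for illustration; assumes M.data has a float dtype
    so the result can be written back into the same buffer.
    """
    np.log1p(M.data, out=M.data)
    return M


# Illustrative test data: roughly 0.1% of the entries are nonzero.
rng = np.random.default_rng(0)
dense = rng.random((200, 300))
dense[dense < 0.999] = 0.0

expected = np.log(1 + dense)  # dense reference result

M = sps.csr_matrix(dense)     # copies the nonzero values into M.data
sparse_log1p_inplace(M)

# Alternative: the built-in method, which returns a new sparse matrix.
M_method = sps.csr_matrix(dense).log1p()
```

Because log(1 + 0) is 0, the implicit zeros never need to be touched, so the sparsity pattern is unchanged and only the stored values are processed.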
You could also use the where
argument on a dense array, but I doubt it would actually be any faster:
mask = M != 0  # compute the mask once, before M is modified
np.add(M, 1, out=M, where=mask)
np.log(M, out=M, where=mask)
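For the dense case, the coordinate pairs from nonzero() mentioned in the question can also be used directly through fancy indexing. The sketch below compares that with the where mask on a tiny made-up array; the variable names are illustrative, np.log1p stands in for the add-then-log pair, and a float dtype is assumed.

```python
import numpy as np

# Small illustrative array; in practice this would be the large dense matrix.
A = np.array([[0.0, 1.0, 0.0],
              [3.0, 0.0, 0.0]])
expected = np.log(1 + A)  # reference: zeros stay zero since log(1 + 0) == 0

# Variant 1: index with the coordinate arrays returned by nonzero().
B = A.copy()
x, y = B.nonzero()
B[x, y] = np.log1p(B[x, y])

# Variant 2: ufunc `where` mask, computed once before anything is modified.
C = A.copy()
mask = C != 0
np.log1p(C, out=C, where=mask)
```

Both variants still store and scan the full dense array, so neither is likely to beat the sparse version for a matrix this sparse.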