I'm working with large sparse matrices in Python. The representation of my matrix gives me the number of stored elements, for example
<100000x100000 sparse matrix of type '<type 'numpy.float64'>'
with 1244024860 stored elements in Compressed Sparse Row format>
My question is: how do I get Python to return the number 1244024860 to me? I want to use this number as an approximation to the number of nonzero elements (even though some of the stored elements could be zeros).
For smaller matrices I was using the sparse_mat.count_nonzero() method, but this method actually does computations (I guess it checks that the stored elements are actually different from zero), so it is very inefficient for my large matrix.
Use the nnz attribute. For example,
In [79]: from scipy.sparse import csr_matrix
In [80]: a = csr_matrix([[0, 1, 2, 0], [0, 0, 0, 0], [0, 0, 0, 3]])
In [81]: a
Out[81]:
<3x4 sparse matrix of type '<class 'numpy.int64'>'
with 3 stored elements in Compressed Sparse Row format>
In [82]: a.nnz
Out[82]: 3
The attributes of the csr_matrix class are described in the csr_matrix documentation (scroll down to find them).
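To illustrate the difference the question hints at, here is a minimal sketch of how nnz and count_nonzero() can disagree when a matrix holds explicitly stored zeros. The small matrix and the variable names (data, indices, indptr, stored, nonzero) are made up for the example; it assumes a reasonably recent SciPy in which csr_matrix accepts a (data, indices, indptr) triple and keeps explicit zeros as given.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Build a 3x4 CSR matrix directly from its internal arrays so that one
# explicitly stored zero survives (hypothetical example data).
data = np.array([1.0, 0.0, 2.0, 3.0])   # second entry is a stored zero
indices = np.array([1, 2, 3, 3])        # column index of each stored entry
indptr = np.array([0, 3, 3, 4])         # row i owns data[indptr[i]:indptr[i+1]]
a = csr_matrix((data, indices, indptr), shape=(3, 4))

# Number of stored entries, explicit zeros included. This is the number
# shown in the matrix repr.
stored = a.nnz

# Number of stored entries whose value is actually nonzero. This has to
# look at the values, which is why it costs more on a huge matrix.
nonzero = a.count_nonzero()

print(stored, nonzero)

# Optional: drop the explicit zeros in place, after which both counts agree.
a.eliminate_zeros()
print(a.nnz, a.count_nonzero())
```

If an approximation is all that is needed, as in the question, nnz alone is enough; eliminate_zeros() is only worth calling when the exact count matters, since it also has to visit the stored values.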