Here's the current implementation:
def nonzero_indexes_by_row(input):
    return [
        np.nonzero(row)[1]
        for row in csr_matrix(input.T)
    ]
The matrix is very large (1.5M, 500K). Since I'm accessing rows, I have to convert from CSC to CSR first. The result is a 2D list: one list per row of the original matrix, each containing the indexes of the nonzero entries in that row.
The current process takes 20 minutes. Is there a faster way?
Sure. You're pretty close to having an ideal solution, but you're allocating some unnecessary arrays. Here's a faster way:
from scipy import sparse
import numpy as np

def my_impl(csc):
    csr = csc.tocsr()
    return np.split(csr.indices, csr.indptr[1:-1])

def your_impl(input):
    return [
        np.nonzero(row)[1]
        for row in sparse.csr_matrix(input)
    ]
## Results
# demo data
csc = sparse.random(15000, 5000, format="csc")
your_result = your_impl(csc)
my_result = my_impl(csc)
## Tests for correctness
# Same result
assert all(np.array_equal(x, y) for x, y in zip(your_result, my_result))
# Right number of rows
assert len(my_result) == csc.shape[0]
## Speed
%timeit my_impl(csc)
# 31 ms ± 1.26 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit your_impl(csc)
# 1.49 s ± 19.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
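In case it helps to see why the split works, here is a minimal sketch on a tiny matrix. The `dense` array is made up purely for illustration. In CSR storage, `indices` holds the column index of every stored entry, row after row, and `indptr` marks where each row starts and ends inside `indices`.

```python
import numpy as np
from scipy import sparse

# Small dense matrix, made up only for illustration.
dense = np.array([
    [0, 5, 0, 7],
    [0, 0, 0, 0],
    [3, 0, 0, 9],
])
csr = sparse.csr_matrix(dense)

# Column index of every stored entry, laid out row after row.
print(csr.indices)  # [1 3 0 3]
# indptr[i]:indptr[i + 1] is the slice of `indices` that belongs to row i.
print(csr.indptr)   # [0 2 2 4]

# Cutting `indices` at the interior row boundaries gives one array per row.
# The empty middle row comes out as an empty array.
rows = np.split(csr.indices, csr.indptr[1:-1])
print([r.tolist() for r in rows])  # [[1, 3], [], [0, 3]]
```

No per-row matrix objects get created here, which is where I'd expect most of the speedup to come from.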
Side question: why are you transposing the matrix? Wouldn't you then be getting the non-zero entries of the columns? If that's what you want, you don't even need to convert to CSR and can just run:
np.split(csc.indices, csc.indptr[1:-1])
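Here is a rough sketch of the per-column version on toy data, checked against plain NumPy. The arrays `dense`, `data`, `indices` and `indptr` are invented for the example. One caveat I'm fairly sure applies: `indices` also lists entries that are stored explicitly but hold the value 0, whereas `np.nonzero` skips them, so the sketch calls `eliminate_zeros()` to cover the case where your matrix has such entries.

```python
import numpy as np
from scipy import sparse

# Toy data, invented for this example.
dense = np.array([
    [0, 5, 0, 7],
    [0, 0, 0, 0],
    [3, 0, 0, 9],
])
csc = sparse.csc_matrix(dense)

# In CSC storage, `indices` holds row indices, column after column.
cols = np.split(csc.indices, csc.indptr[1:-1])

# Reference result computed column by column on the dense array.
expected = [np.nonzero(dense[:, j])[0] for j in range(dense.shape[1])]
assert all(np.array_equal(x, y) for x, y in zip(cols, expected))

# A 3x2 matrix with one explicitly stored zero at row 1, column 0.
data = np.array([1, 0, 2])
indices = np.array([0, 1, 2])
indptr = np.array([0, 2, 3])
with_zero = sparse.csc_matrix((data, indices, indptr), shape=(3, 2))

# The stored zero still shows up in `indices`.
before = [c.tolist() for c in np.split(with_zero.indices, with_zero.indptr[1:-1])]

# eliminate_zeros() drops stored zeros in place.
with_zero.eliminate_zeros()
after = [c.tolist() for c in np.split(with_zero.indices, with_zero.indptr[1:-1])]

print(before)  # [[0, 1], [2]]
print(after)   # [[0], [2]]
```

If your matrix never holds explicit zeros, the extra call shouldn't be needed.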