python scipy sparse-matrix

Compressed Sparse Row, indptr has two same values


indptr points to the row starts in indices and data. I converted my matrix to a CSR matrix and saved it with np.savez(). However, I noticed that the first elements of indptr are as follows:

1
1
23
195
213
256
284
317

which suggests that the first row and the second row start at the same position in data. Is this an error, and if so, what causes it?


Solution

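  • Two equal consecutive values in indptr mean that the row between them has no stored entries. In your case, it means that the 2nd row is all zeros: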

    In [187]: from scipy import sparse
    In [191]: M=sparse.csr_matrix([[0,0,1],[0,0,0],[0,1,0],[1,1,0]])
    In [192]: M.A
    Out[192]: 
    array([[0, 0, 1],
           [0, 0, 0],
           [0, 1, 0],
           [1, 1, 0]], dtype=int32)
    In [193]: M.indptr
    Out[193]: array([0, 1, 1, 2, 4], dtype=int32)
    

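    (The missing 0 at the start of your indptr is a bit of a concern, though: the indptr of a csr_matrix normally begins with 0, and its length is the number of rows plus one.)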

    What does the .A (toarray()) show?
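
To check for empty rows without converting to a dense array, one option is to take the differences of consecutive indptr values, which give the number of stored entries per row. The sketch below reuses the small matrix from the answer; the `q_indptr` array is an assumption, namely the values from the question with a leading 0 prepended, since the question only shows the first few elements.

```python
import numpy as np
from scipy import sparse

M = sparse.csr_matrix([[0, 0, 1], [0, 0, 0], [0, 1, 0], [1, 1, 0]])

# Stored entries in row i: indptr[i + 1] - indptr[i]
row_counts = np.diff(M.indptr)

# Rows whose count is zero have no stored entries
empty_rows = np.flatnonzero(row_counts == 0)

# Column indices and values of row i are the slice indptr[i]:indptr[i + 1]
i = 3
start, end = M.indptr[i], M.indptr[i + 1]
row_cols = M.indices[start:end]
row_vals = M.data[start:end]

# Assumption: the question's values with the (apparently missing) leading 0 restored
q_indptr = np.array([0, 1, 1, 23, 195, 213, 256, 284, 317])
q_counts = np.diff(q_indptr)
q_empty = np.flatnonzero(q_counts == 0)

print(row_counts, empty_rows)
print(q_counts, q_empty)
```

Under that assumption, the only zero count in the question's data is for row index 1, which matches the reading that the 2nd row is all zeros.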