I have two scipy.sparse.csr_matrix objects with identical non-zero locations. I assumed that their indptr and indices arrays would be identical, but it turns out that only their indptr arrays are identical (and sorted). The indices arrays, which store the column indices for each row, are permutations of each other within each row.
In the image above, let A = sum_predictions and B = Yt_preds[4] be two matrices with identical non-zero locations. Their indptr arrays (the cumulative count of non-zeros up to each row, so consecutive differences give the number of non-zeros per row) are identical. However, within each row the indices arrays (the column indices for that row) are permutations of each other.
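For reference, here is a minimal sketch of how indptr and indices relate in a CSR matrix. The small matrix M is made up for illustration and is not the data from the question.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 3x3 example matrix (not the data from the question).
M = csr_matrix(np.array([[1, 0, 2],
                         [0, 3, 0],
                         [4, 0, 5]]))

# indptr holds cumulative offsets: row i occupies the slice
# indptr[i]:indptr[i + 1] of both indices and data.
row_counts = np.diff(M.indptr)  # number of non-zeros in each row

# Column indices of the non-zeros stored for row 0.
row0_cols = M.indices[M.indptr[0]:M.indptr[1]]

print(M.indptr, row_counts, row0_cols)
```

Nothing in this layout forces the column indices inside one row's slice to be in ascending order, which is why two matrices can share indptr and still differ in indices.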
How is the order of the indices array in scipy sparse matrix determined? How can I get a representation where two matrices with identical non-zero locations have the same indptr and indices arrays?
The solution is what CJR proposed: mat.sort_indices(), which sorts the column indices within each row in place.
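A small sketch of the effect, assuming two made-up matrices A and B (not the ones from the question) that share a sparsity pattern but were built with different within-row column order. The CSR format does not require sorted column indices within a row, so the stored order depends on how each matrix was constructed.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same sparsity pattern, built directly from (data, indices, indptr).
# B's column indices are deliberately out of order within rows 0 and 2.
indptr = np.array([0, 2, 3, 5])
A = csr_matrix((np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
                np.array([0, 2, 1, 0, 2]), indptr), shape=(3, 3))
B = csr_matrix((np.array([20.0, 10.0, 30.0, 50.0, 40.0]),
                np.array([2, 0, 1, 2, 0]), indptr), shape=(3, 3))

same_before = np.array_equal(A.indices, B.indices)

# Sort column indices within each row in place; data is permuted to match.
A.sort_indices()
B.sort_indices()

same_after = np.array_equal(A.indices, B.indices)
print(same_before, same_after)
```

If the original matrix should stay untouched, mat.sorted_indices() returns a sorted copy instead, and mat.has_sorted_indices reports whether sorting is needed.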