python numpy scipy sparse-matrix

How to define an (n, 0) sparse matrix in scipy, or how to assemble a sparse matrix column-wise?


I have a loop that in each iteration gives me a column c of a sparse matrix N.

To assemble/grow/accumulate N column by column I thought of using

N = scipy.sparse.hstack([N, c]) 

To do this, it would be nice to initialize the matrix with rows of length 0. However,

N = scipy.sparse.csc_matrix((4,0))

raises a ValueError: invalid shape.

Any suggestions on how to do this right?


Solution

  • You can't. Sparse matrices are restricted compared to NumPy arrays and in particular don't allow 0 for any axis. All sparse matrix constructors check for this, so if and when you do manage to build such a matrix, you're exploiting a SciPy bug and your script is likely to break when you upgrade SciPy.

    That being said, I don't see why you'd need an n × 0 sparse matrix since an n × 0 NumPy array is allowed and takes practically no storage space.

    Turns out sparse.hstack cannot handle a NumPy array with a zero axis, so disregard my previous comment. However, what I think you should do is collect all the columns in a list, then hstack them in one call. That's better than your loop since appending to a list takes amortized constant time, while each hstack call takes time linear in the size of the matrix built so far. So your proposed algorithm takes quadratic time overall, while it could be linear.
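
A minimal sketch of the collect-then-stack approach, assuming each iteration yields an (n, 1) sparse column. The names make_column and build_columnwise are hypothetical stand-ins for the asker's loop, not part of SciPy, and the sketch assumes at least one column is produced.

```python
import numpy as np
import scipy.sparse as sp


def make_column(i, n=4):
    # Hypothetical stand-in for the loop body: an (n, 1) sparse column
    # holding the single value i + 1 in row i % n.
    data = np.array([float(i + 1)])
    rows = np.array([i % n])
    cols = np.array([0])
    return sp.csc_matrix((data, (rows, cols)), shape=(n, 1))


def build_columnwise(columns):
    # Collect the columns in a plain list (amortized O(1) per append),
    # then stack them with a single hstack call at the end.
    collected = []
    for c in columns:
        collected.append(sp.csc_matrix(c))
    return sp.hstack(collected, format="csc")


N = build_columnwise(make_column(i) for i in range(3))
print(N.shape)
```

This way no empty (n, 0) starting matrix is needed, because N is only created once all columns are known.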