If I try to slice a sparse matrix or read the value at a given [row, column], I get an IndexError.
More precisely, I have the following scipy.sparse.csr_matrix, which I save to a file and then load again:
...
>>> A = scipy.sparse.csr_matrix((vals, (rows, cols)), shape=(output_dim, input_dim))
>>> np.save(open('test_matrix.dat', 'wb'), A)
...
>>> A = np.load('test_matrix.dat', allow_pickle=True)
>>> A
array(<831232x798208 sparse matrix of type '<class 'numpy.float32'>'
with 109886100 stored elements in Compressed Sparse Row format>,
dtype=object)
However, when I try to get the value at a given [row, column] pair, I get the following error:
>>> A[1,1]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: too many indices for array
Why is that happening?
Just to clarify, I'm sure that the matrix is not empty, as I can see its content if I do
>>> print(A)
(0, 1) 0.24914551
(0, 2) 0.6669922
(1, 1) 0.75097656
(1, 3) 0.6640625
(2, 3) 0.3359375
(2, 514) 0.34960938
...
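When you save your sparse matrix with np.save and reload it, you get a NumPy array with a single entry: an object, namely your sparse matrix. That wrapper array is 0-dimensional, so indexing it with two indices, as in A[1,1], raises the IndexError. You should use scipy.sparse.save_npz instead.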
For example:
import scipy.sparse as sps
import numpy as np
A = sps.csr_matrix((10,10))
A
<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 0 stored elements in Compressed Sparse Row format>
np.save('test_matrix.dat', A)
B = np.load('test_matrix.dat.npy', allow_pickle=True)
B
array(<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 0 stored elements in Compressed Sparse Row format>, dtype=object)
B[1,1]
IndexError Traceback (most recent call last)
<ipython-input-101-969f8bd5206a> in <module>
----> 1 B[1,1]
IndexError: too many indices for array
sps.save_npz('sparse_dat', A)
C = sps.load_npz('sparse_dat.npz')
C
<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 0 stored elements in Compressed Sparse Row format>
C[1,1]
0.0
Mind you, you can still retrieve A from B like so:
D = B.tolist()
D
<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 0 stored elements in Compressed Sparse Row format>
D[1,1]
0.0