Search code examples
numpyscipysparse-matrix

unexpected outcome caused by scipy.csr_matrix


>>> import numpy as np
>>> from scipy.sparse import *
>>> x1 = np.eye(3, dtype=float)
>>> x2 = csr_matrix(x1, dtype=float, shape =x1.shape)
>>> assert x2.todense().any()==x1.any()  ## holds true
>>> w = np.ones((3,1))
>>> dw1 = w - x1[:,0]
>>> dw2 = w - x2[:,0]

which gives me

>>> print dw1
[[ 0.  1.  1.]
 [ 0.  1.  1.]
 [ 0.  1.  1.]]

while

>>> print dw2
[[ 0.]
 [ 1.]
 [ 1.]]

My question is why dw1 and dw2 differ? Should they defer, Is it a bug? Many Thanks!


Solution

  • This is a slicing/indexing issue. The questionable line here is

    w - x1[:, 0]
    

    which has nothing to do with sparseness. You have sliced x1, obtaining a 1D array. When this gets subtracted from w, numpy broadcasts this array back into a 3 by 3 matrix (because it equates the number of columns of both terms), which I guess is not what you wanted.

    It looks like you just wanted the submatrix consisting of the first column of x1. This would be

    w - x1[:, [0]]
    

    returning

    array([[ 0.],
           [ 1.],
           [ 1.]])
    

    consistent with the other result.

    In case of a sparse matrix, you automatically get a submatrix (not a 1D array) because indexing works differently for those.