python numpy scipy sparse-matrix

SciPy sparse matrix not modified when passed into function


I have noticed an apparent inconsistency in how SciPy sparse matrices and NumPy arrays are modified when passed into functions. In particular, can someone explain why the sparse matrix a below is not modified in the caller's scope by func, but the array b is?

from scipy import sparse
import numpy as np

def func(m):
    m += m

a = sparse.identity(2)
b = np.array([1, 2])

print(a.todense()) # [[1,0],[0,1]]
func(a)
print(a.todense()) # Still [[1,0],[0,1]]. Why???

print(b) # [1, 2]
func(b)
print(b) # Now [2, 4]

Solution

  • In [11]: arr = np.array([[1,0],[2,3]])
    In [12]: id(arr)
    Out[12]: 1915221691344
    
    In [13]: M = sparse.csr_matrix(arr)
    In [14]: id(M)
    Out[14]: 1915221319840
    
    In [15]: arr += arr
    
    In [16]: id(arr)
    Out[16]: 1915221691344
    

    For a NumPy array, += operates in place: the id is unchanged, so it is still the same object.

    In [17]: M += M    
    In [18]: id(M)
    Out[18]: 1915221323200
    

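    For the sparse matrix, += creates a new sparse matrix object and rebinds the name M to it; the original matrix is not modified in place. Inside func, only the local name m is rebound, so the caller's a is left unchanged.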
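
    To see how this plays out inside a function, here is a minimal sketch. The helpers add_to_self and doubled are hypothetical names for illustration, not part of NumPy or SciPy, and the CSR format is just an assumed example:

    ```python
    import numpy as np
    from scipy import sparse

    def add_to_self(m):
        """Run `m += m` and report whether `m` still names the caller's object."""
        original = m
        m += m  # ndarray: updated in place; sparse matrix: local name rebound to a new object
        return m is original

    def doubled(m):
        """Return the sum instead of relying on in-place mutation."""
        return m + m

    arr = np.array([[1, 0], [2, 3]])
    M = sparse.csr_matrix(arr)  # copies the values, so M does not share memory with arr

    array_mutated = add_to_self(arr)  # the caller's array is changed
    sparse_mutated = add_to_self(M)   # the caller's matrix is untouched
    print(array_mutated, sparse_mutated)

    M = doubled(M)  # rebinding in the caller works for both types
    print(M.toarray())
    ```

    Returning the result and rebinding it in the caller avoids depending on whether a given type implements in-place addition.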

    For this particular operation, which only rescales the stored values and leaves the sparsity pattern alone, the data attribute can be modified in place:

    In [20]: M.data
    Out[20]: array([2, 4, 6], dtype=int32)
    
    In [21]: M.data += M.data
    
    In [22]: M.toarray()
    Out[22]: 
    array([[ 4,  0],
           [ 8, 12]], dtype=int32)
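
    If a function really has to change the caller's matrix, the same idea can be wrapped up as below. This is only a sketch: double_in_place is a hypothetical name, and it assumes a format such as CSR whose data attribute is a flat NumPy array of the stored values (it would not be appropriate for LIL, for example):

    ```python
    import numpy as np
    from scipy import sparse

    def double_in_place(m):
        """Double every stored value; the matrix object and its sparsity pattern are unchanged."""
        m.data *= 2  # in-place update of the underlying ndarray of stored values

    M = sparse.csr_matrix(np.array([[1, 0], [2, 3]]))
    same_object = M

    double_in_place(M)

    print(M is same_object)  # no new matrix was created
    print(M.toarray())
    ```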
    

    But in general, adding something to a sparse matrix can change its sparsity pattern, that is, which entries are actually stored. The sparse developers presumably decided that it was not possible, or just not cost effective (in programming effort or run time), to do this without creating a new matrix.
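
    A small illustration of the sparsity point, assuming two CSR matrices whose nonzero entries sit in different positions (the matrices A, B and C are made up for this example):

    ```python
    import numpy as np
    from scipy import sparse

    A = sparse.csr_matrix(np.array([[1, 0], [0, 0]]))
    B = sparse.csr_matrix(np.array([[0, 2], [3, 0]]))

    C = A + B  # the sum has to store entries that A never held

    # Number of stored values in each matrix
    print(A.nnz, B.nnz, C.nnz)
    print(C.toarray())
    ```

    Because the index arrays of A are too small to describe C, the result cannot simply be written into A's existing storage.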

    While a sparse matrix is patterned on the np.matrix subclass, it is not a subclass of ndarray, and is not obligated to behave in exactly the same way.

    In [30]: type(M).__mro__
    Out[30]: 
    (scipy.sparse.csr.csr_matrix,
     scipy.sparse.compressed._cs_matrix,
     scipy.sparse.data._data_matrix,
     scipy.sparse.base.spmatrix,
     scipy.sparse.data._minmax_mixin,
     scipy.sparse._index.IndexMixin,
     object)
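
    The same point can be checked directly. A short sketch, again using a CSR matrix as the assumed example:

    ```python
    import numpy as np
    from scipy import sparse

    M = sparse.csr_matrix(np.array([[1, 0], [2, 3]]))

    is_ndarray = isinstance(M, np.ndarray)          # a sparse matrix is not an ndarray
    is_sparse = sparse.issparse(M)                  # SciPy's own check for sparse objects
    dense_is_ndarray = isinstance(M.toarray(), np.ndarray)

    print(is_ndarray, is_sparse, dense_is_ndarray)
    ```

    Code that has to accept both kinds of input can branch on sparse.issparse rather than assuming ndarray behaviour.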