python, scipy, sparse-matrix

Scipy CSR Matrix Element-wise Addition


Unlike NumPy arrays/matrices, a CSR matrix does not seem to allow automatic broadcasting. The CSR implementation has methods for element-wise multiplication, but not for addition. How can I add a scalar to a CSR sparse matrix efficiently?


Solution

  • The goal here is to add a scalar to the non-zero entries only and to preserve the matrix's sparsity, i.e. to leave the zero entries untouched.


    From the fine Scipy docs (** emphasis ** is mine):

    Attributes
    
    nnz                   Get the count of explicitly-stored values (nonzeros)  
    has_sorted_indices    Determine whether the matrix has sorted indices  
    dtype (dtype)         Data type of the matrix  
    shape (2-tuple)       Shape of the matrix  
    ndim  (int)           Number of dimensions (this is always 2)  
    **data                CSR format data array of the matrix** 
    indices               CSR format index array of the matrix  
    indptr                CSR format index pointer array of the matrix
    

    So I tried (the first part is "stolen" from the referenced documentation)

    In [18]: from numpy import array
    
    In [19]: from scipy.sparse import csr_matrix
    
    In [20]: row = array([0, 0, 1, 2, 2, 2])
        ...: col = array([0, 2, 2, 0, 1, 2])
        ...: data = array([1, 2, 3, 4, 5, 6])
        ...: a = csr_matrix((data, (row, col)), shape=(3, 3))
        ...: 
    
    In [21]: a.todense()
    Out[21]: 
    matrix([[1, 0, 2],
            [0, 0, 3],
            [4, 5, 6]], dtype=int64)
    
    In [22]: a.data += 10
    
    In [23]: a.todense()
    Out[23]: 
    matrix([[11,  0, 12],
            [ 0,  0, 13],
            [14, 15, 16]], dtype=int64)
    
    

    It works. If you need to keep the original matrix, pass a modified copy of the data array to the constructor, together with the original indices and indptr arrays.
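A minimal sketch of that non-destructive variant, assuming the same 3x3 matrix as in the session above. The name `b` is only illustrative. The `(data, indices, indptr)` form of the `csr_matrix` constructor reuses the sparsity pattern of `a`, while `a.data + 10` allocates a new data array, so `a` itself is left alone.

```python
import numpy as np
from scipy.sparse import csr_matrix

row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
a = csr_matrix((data, (row, col)), shape=(3, 3))

# `a.data + 10` creates a new array, so `a` keeps its original values.
# `indices` and `indptr` describe where the stored entries live; passing
# them unchanged gives `b` the same sparsity pattern as `a`.
b = csr_matrix((a.data + 10, a.indices, a.indptr), shape=a.shape)

print(a.toarray())  # original values
print(b.toarray())  # stored entries shifted by 10, zeros untouched
```

By default the constructor does not copy its inputs, so `b` may share `indices` and `indptr` with `a`; pass `copy=True` if the two matrices must be fully independent. An equivalent route is `b = a.copy()` followed by `b.data += 10`.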


    Disclaimer

    This answer addresses this interpretation of the question:

    I have a sparse matrix, I want to add a scalar to the non-zero entries, preserving the sparseness both of the matrix and of its programmatic representation.

    My reasoning for choosing this interpretation is that adding a scalar to all entries turns the sparse matrix into a VERY dense matrix...
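To illustrate the point about density, here is a rough sketch that assumes a randomly generated 1000x1000 matrix with about 0.1% stored entries (the sizes and the names `s` and `dense` are arbitrary choices for the illustration). Adding a non-zero scalar to every entry requires going through a dense array, and every element of the result is then non-zero.

```python
import numpy as np
from scipy import sparse

# A sparse matrix with roughly 0.1% of its entries stored.
s = sparse.random(1000, 1000, density=0.001, format="csr")

# Adding the scalar to *all* entries means materialising the dense array.
dense = s.toarray() + 10.0

print(s.nnz)                    # stored entries in the sparse matrix
print(np.count_nonzero(dense))  # non-zero entries after the addition
print(dense.size)               # total number of elements
```

After the addition there is nothing left for a sparse format to exploit, since no entry is zero any more.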

    Whether this is the correct interpretation, I don't know: on the one hand, the OP approved my answer (at least as of today, 2017-07-13); on the other hand, the comments beneath their question suggest they are of a different opinion.

    The answer is, however, useful when the sparse matrix represents, e.g., sparse measurements and you want to correct a measurement bias, subtract a mean value, etc., so I'm going to leave it here, even if it can be judged controversial.
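As a sketch of that use case, assume a small floating-point matrix of measurements in which zeros mean "no measurement" (the names `readings` and `centered` are hypothetical). Subtracting the mean of the stored values touches only the `data` array:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Zeros stand for "no measurement"; only the stored values are real readings.
readings = csr_matrix(np.array([[1.0, 0.0, 2.0],
                                [0.0, 0.0, 3.0],
                                [4.0, 5.0, 6.0]]))

centered = readings.copy()
# Mean over the stored values only (3.5 here), not over all nine cells.
centered.data -= centered.data.mean()

print(centered.toarray())
print(centered.nnz)  # still 6 stored entries
```

Note the float dtype: with an integer `data` array, as in the session above, an in-place subtraction of a float raises a casting error, so convert first with `astype(float)`. If a corrected value happens to become exactly zero it stays stored as an explicit zero; `eliminate_zeros()` drops such entries if that is not wanted.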