Tags: python, numpy, scipy, sparse-matrix

element-wise operations on a sparse matrix


If you have a sparse matrix X:

>>> print(type(X))
<class 'scipy.sparse.csr.csr_matrix'>

How can you sum the squares of the elements in each row and save the results into a list? For example:

>>> print(X.todense())
[[0 2 0 2]
 [0 2 0 1]]

How can you turn that into a list of the sums of squares of each row:

[[0²+2²+0²+2²]
 [0²+2²+0²+1²]]

or: [8, 5]


Solution

  • First of all, the csr matrix has a .sum method (which relies on the dot product) that works well, so all that is missing is the squaring. The simplest solution is to create a copy of the sparse matrix, square its data, and then sum it:

    squared_X = X.copy()
    # now square the data in squared_X
    squared_X.data **= 2
    
    # and sum each row:
    squared_sum = squared_X.sum(1)
    # and delete the squared_X:
    del squared_X
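
As a self-contained sketch of the copy-and-square approach, using the example matrix from the question. The conversion through np.asarray(...).ravel().tolist() is an addition here, on the assumption that a plain Python list is wanted: on a csr_matrix, .sum(1) by itself returns an (n_rows, 1) matrix rather than a list.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Example matrix from the question.
X = csr_matrix(np.array([[0, 2, 0, 2],
                         [0, 2, 0, 1]]))

squared_X = X.copy()     # copies data, indices and indptr
squared_X.data **= 2     # squares only the stored entries

# Flatten the (n_rows, 1) result into a plain list of row sums.
row_sums = np.asarray(squared_X.sum(axis=1)).ravel().tolist()
print(row_sums)  # [8, 5]
```

X itself is left untouched, since only the copy's data array is modified.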
    

    If you really must save the space, you could square .data in place and restore it afterwards (this still copies the values, but not the index arrays), something along these lines:

    X.sum_duplicates()  # merge duplicate entries first; not sure whether they occur in normal usage
    old_data = X.data.copy()
    X.data **= 2
    squared_sum = X.sum(1)
    X.data = old_data
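
One possible way to make the in-place variant safer is to wrap it in try/finally, so the original values are restored even if the sum raises. This is only a sketch: the helper name row_sums_of_squares_inplace is hypothetical and not part of SciPy.

```python
import numpy as np
from scipy.sparse import csr_matrix

def row_sums_of_squares_inplace(X):
    """Hypothetical helper: per-row sum of squares, temporarily squaring X.data."""
    X.sum_duplicates()          # merge duplicate entries before squaring
    old_data = X.data.copy()    # back up the stored values only
    try:
        X.data **= 2
        return np.asarray(X.sum(axis=1)).ravel().tolist()
    finally:
        X.data = old_data       # restore the original values in every case

X = csr_matrix(np.array([[0, 2, 0, 2],
                         [0, 2, 0, 1]]))
result = row_sums_of_squares_inplace(X)
print(result)  # [8, 5]
```

Note that X is briefly in a modified state, so this is not suitable if another thread reads X at the same time.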
    

    EDIT: There is actually another nice way, as the csr matrix has a .multiply method for elementwise multiplication:

    squared_sum = X.multiply(X).sum(1)
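
A small runnable check of the .multiply route on the question's matrix. The comparison with .power(2) is an addition not mentioned in the answer; it is another elementwise method on csr matrices and should give the same result here.

```python
import numpy as np
from scipy.sparse import csr_matrix

X = csr_matrix(np.array([[0, 2, 0, 2],
                         [0, 2, 0, 1]]))

# Elementwise product of X with itself, then row sums.
via_multiply = np.asarray(X.multiply(X).sum(axis=1)).ravel()

# Elementwise power, then row sums (alternative, not from the original answer).
via_power = np.asarray(X.power(2).sum(axis=1)).ravel()

print(via_multiply.tolist())  # [8, 5]
print(via_power.tolist())     # [8, 5]
```

Both variants build a temporary sparse matrix and leave X unchanged.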
    

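    Addition: Elementwise operations are thus easily done through csr.data, which stores the values of all stored (nonzero) elements. This only works for operations that map zero to zero, such as squaring, since the implicit zeros are never touched. NOTE: .sum_duplicates() is needed if the matrix holds several entries for the same position, because squaring them separately gives a² + b² instead of (a + b)². I am not sure which operations produce duplicates in normal usage; building a csr matrix directly from (data, indices, indptr) arrays with repeated column indices is one way.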
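
To illustrate why duplicates matter, here is a minimal sketch that deliberately builds a csr matrix with a repeated column index. The matrix is contrived for the demonstration and is not from the question.

```python
import numpy as np
from scipy.sparse import csr_matrix

# One row in which column 1 is stored twice (1 and 1, i.e. a logical value of 2).
data = np.array([1, 1])
indices = np.array([1, 1])
indptr = np.array([0, 2])
X = csr_matrix((data, indices, indptr), shape=(1, 4))

# Squaring the raw data squares each duplicate separately: 1² + 1².
naive = X.copy()
naive.data **= 2
naive_sum = int(np.asarray(naive.sum(axis=1)).ravel()[0])

# Merging duplicates first gives the intended (1 + 1)².
fixed = X.copy()
fixed.sum_duplicates()
fixed.data **= 2
fixed_sum = int(np.asarray(fixed.sum(axis=1)).ravel()[0])

print(naive_sum, fixed_sum)  # 2 4
```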