Search code examples
pythonnumpyscipysparse-matrix

scipy sparse matrix division


I have been trying to divide a python scipy sparse matrix by a vector sum of its rows. Here is my code

sparse_mat = bsr_matrix((l_data, (l_row, l_col)), dtype=float)
sparse_mat = sparse_mat / (sparse_mat.sum(axis = 1)[:,None])

However, it throws an error no matter how I try it

sparse_mat = sparse_mat / (sparse_mat.sum(axis = 1)[:,None])
File "/usr/lib/python2.7/dist-packages/scipy/sparse/base.py", line 381, in __div__
return self.__truediv__(other)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/compressed.py", line 427, in __truediv__
raise NotImplementedError
NotImplementedError

Anyone with an idea of where I am going wrong?


Solution

  • You can circumvent the problem by creating a sparse diagonal matrix from the reciprocals of your row sums and then multiplying it with your matrix. In the product the diagonal matrix goes left and your matrix goes right.

    Example:

    >>> a
    array([[0, 9, 0, 0, 1, 0],
           [2, 0, 5, 0, 0, 9],
           [0, 2, 0, 0, 0, 0],
           [2, 0, 0, 0, 0, 0],
           [0, 9, 5, 3, 0, 7],
           [1, 0, 0, 8, 9, 0]])
    >>> b = sparse.bsr_matrix(a)
    >>> 
    >>> c = sparse.diags(1/b.sum(axis=1).A.ravel())
    >>> # on older scipy versions the offsets parameter (default 0)
    ... # is a required argument, thus
    ... # c = sparse.diags(1/b.sum(axis=1).A.ravel(), 0)
    ...
    >>> a/a.sum(axis=1, keepdims=True)
    array([[ 0.        ,  0.9       ,  0.        ,  0.        ,  0.1       ,  0.        ],
           [ 0.125     ,  0.        ,  0.3125    ,  0.        ,  0.        ,  0.5625    ],
           [ 0.        ,  1.        ,  0.        ,  0.        ,  0.        ,  0.        ],
           [ 1.        ,  0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
           [ 0.        ,  0.375     ,  0.20833333,  0.125     ,  0.        ,  0.29166667],
           [ 0.05555556,  0.        ,  0.        ,  0.44444444,  0.5       ,  0.        ]])
    >>> (c @ b).todense() # on Python < 3.5 replace c @ b with c.dot(b)
    matrix([[ 0.        ,  0.9       ,  0.        ,  0.        ,  0.1       ,  0.        ],
            [ 0.125     ,  0.        ,  0.3125    ,  0.        ,  0.        ,  0.5625    ],
            [ 0.        ,  1.        ,  0.        ,  0.        ,  0.        ,  0.        ],
            [ 1.        ,  0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
            [ 0.        ,  0.375     ,  0.20833333,  0.125     ,  0.        ,  0.29166667],
            [ 0.05555556,  0.        ,  0.        ,  0.44444444,  0.5       ,  0.        ]])