python numpy scipy sparse-matrix jax

Inserting rows and columns of zeros into a sparse array in Python


I have about 50 relatively large sparse arrays (in scipy.sparse.csr_array format, but that can be changed), and I would like to insert rows and columns of zeros at certain locations. An example in dense format would look like:

A = np.asarray([[1,2,1],[2,4,5],[2,1,6]])
# A = array([[1,2,1],
#            [2,4,5],
#            [2,1,6]])
indices = np.asarray([-1, -1, 2, -1, 4, -1, -1, 7, -1])

# indices =  array([-1, -1, 2, -1, 4, -1, -1, 7, -1])
# insert rows and columns of zeros where indices[i] == -1 to get B

B = np.asarray([[0,0,0,0,0,0,0,0,0],
                [0,0,0,0,0,0,0,0,0],
                [0,0,1,0,2,0,0,1,0],
                [0,0,0,0,0,0,0,0,0],
                [0,0,2,0,4,0,0,5,0],
                [0,0,0,0,0,0,0,0,0],
                [0,0,0,0,0,0,0,0,0],
                [0,0,2,0,1,0,0,6,0],
                [0,0,0,0,0,0,0,0,0]])

A is a sparse array of shape (~2000, ~2000) with ~20000 nonzero entries, and indices has shape (4096,). I can imagine doing it in dense format, but I don't know enough about how the data and indices are stored, and I cannot find a quick and efficient way to do this sort of operation on sparse arrays.

Anyone have any ideas or suggestions?

Thanks.


Solution

  • I would probably do this by passing the data and associated indices into a COO matrix constructor:

    import numpy as np
    from scipy.sparse import coo_matrix
    
    A = np.asarray([[1,2,1],[2,4,5],[2,1,6]])
    indices = np.asarray([-1, -1, 2, -1, 4, -1, -1, 7, -1])
    
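    # Target positions: the non-negative entries of `indices`.
    idx = indices[indices >= 0]
    # All (row, col) pairs of target positions; with the default 'xy'
    # indexing, meshgrid returns the column grid first.
    col, row = np.meshgrid(idx, idx)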
    
    mat = coo_matrix((A.ravel(), (row.ravel(), col.ravel())),
                     shape=(len(indices), len(indices)))
    print(mat)
    #   (2, 2)  1
    #   (2, 4)  2
    #   (2, 7)  1
    #   (4, 2)  2
    #   (4, 4)  4
    #   (4, 7)  5
    #   (7, 2)  2
    #   (7, 4)  1
    #   (7, 7)  6
    
    print(mat.todense())
    # [[0 0 0 0 0 0 0 0 0]
    #  [0 0 0 0 0 0 0 0 0]
    #  [0 0 1 0 2 0 0 1 0]
    #  [0 0 0 0 0 0 0 0 0]
    #  [0 0 2 0 4 0 0 5 0]
    #  [0 0 0 0 0 0 0 0 0]
    #  [0 0 0 0 0 0 0 0 0]
    #  [0 0 2 0 1 0 0 6 0]
    #  [0 0 0 0 0 0 0 0 0]]
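
The example above starts from a dense A, whereas the question's arrays are already sparse. The same idea seems to carry over without densifying: convert A to COO format and remap its stored row and column indices through idx. The following is only a minimal sketch under that assumption. The helper name `expand_sparse` is hypothetical, and it uses `csr_matrix` for the input; `csr_array` should behave the same way, but that is not verified here.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix


def expand_sparse(A, indices):
    """Scatter a sparse square matrix A into a larger, mostly-zero matrix.

    Hypothetical helper: assumes the non-negative entries of `indices`,
    taken in order, are the target positions of A's rows and columns.
    """
    indices = np.asarray(indices)
    idx = indices[indices >= 0]
    if A.shape != (len(idx), len(idx)):
        raise ValueError("A must have one row/column per non-negative index")

    A = A.tocoo()  # COO exposes the stored entries as .row, .col, .data
    n = len(indices)
    # Only the stored (nonzero) entries are touched; each is moved from
    # (r, c) to (idx[r], idx[c]) in the enlarged matrix.
    B = coo_matrix((A.data, (idx[A.row], idx[A.col])), shape=(n, n))
    return B.tocsr()


A_dense = np.asarray([[1, 2, 1], [2, 4, 5], [2, 1, 6]])
indices = np.asarray([-1, -1, 2, -1, 4, -1, -1, 7, -1])

B = expand_sparse(csr_matrix(A_dense), indices)
print(B.toarray())
```

Because this only indexes the three COO arrays, the work scales with the number of stored entries rather than with the full matrix size, so applying it in a loop over all of the arrays should be cheap.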