I have 50ish relatively large sparse arrays (in scipy.sparse.csr_array format, but that can be changed) and I would like to insert rows and columns of zeros at certain locations. An example in dense format would look like:
A = np.asarray([[1,2,1],[2,4,5],[2,1,6]])
# A = array([[1,2,1],
# [2,4,5],
# [2,1,6]])
indices = np.asarray([-1, -1, 2, -1, 4, -1, -1, 7, -1])
# indices = array([-1, -1, 2, -1, 4, -1, -1, 7, -1])
# insert rows and columns of zeros where indices[i] == -1 to get B
B = np.asarray([[0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0,0],
[0,0,1,0,2,0,0,1,0],
[0,0,0,0,0,0,0,0,0],
[0,0,2,0,4,0,0,5,0],
[0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,0,0,0,0],
[0,0,2,0,1,0,0,6,0],
[0,0,0,0,0,0,0,0,0]])
A is a sparse array of shape (~2000, ~2000) with ~20000 nonzero entries, and indices is of shape (4096,). I can imagine doing it in dense format, but I don't know enough about the way the data and indices are stored, and I cannot find a quick and efficient way to do this sort of operation for sparse arrays.
Anyone have any ideas or suggestions?
Thanks.
I would probably do this by passing the data and associated indices into a COO matrix constructor:
import numpy as np
from scipy.sparse import coo_matrix
A = np.asarray([[1,2,1],[2,4,5],[2,1,6]])
indices = np.asarray([-1, -1, 2, -1, 4, -1, -1, 7, -1])
idx = indices[indices >= 0]
col, row = np.meshgrid(idx, idx)
mat = coo_matrix((A.ravel(), (row.ravel(), col.ravel())),
                 shape=(len(indices), len(indices)))
print(mat)
# (2, 2) 1
# (2, 4) 2
# (2, 7) 1
# (4, 2) 2
# (4, 4) 4
# (4, 7) 5
# (7, 2) 2
# (7, 4) 1
# (7, 7) 6
print(mat.todense())
# [[0 0 0 0 0 0 0 0 0]
# [0 0 0 0 0 0 0 0 0]
# [0 0 1 0 2 0 0 1 0]
# [0 0 0 0 0 0 0 0 0]
# [0 0 2 0 4 0 0 5 0]
# [0 0 0 0 0 0 0 0 0]
# [0 0 0 0 0 0 0 0 0]
# [0 0 2 0 1 0 0 6 0]
# [0 0 0 0 0 0 0 0 0]]
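The example above starts from a dense A, so the meshgrid step touches every entry. If A is already sparse, as in the question, the same idea should work without densifying: convert A to COO and remap its stored row and column coordinates through idx. Below is a minimal sketch of that, not a tested solution for the real data. The helper name expand_sparse is made up for illustration, and it assumes (as in the code above) that the non-negative values in indices are the target positions, listed in the same order as the rows and columns of A.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix


def expand_sparse(A, indices):
    """Scatter a square sparse A into a larger all-zero sparse matrix.

    Hypothetical helper: assumes the non-negative entries of `indices`
    are the target row/column positions for A's rows/columns, in order.
    """
    indices = np.asarray(indices)
    idx = indices[indices >= 0]           # target positions, e.g. [2, 4, 7]
    if A.shape != (len(idx), len(idx)):
        raise ValueError("A must be square with one row per non-negative index")

    coo = A.tocoo()                       # exposes .row, .col and .data
    n = len(indices)
    # Only the stored nonzeros are moved; nothing is densified.
    out = coo_matrix((coo.data, (idx[coo.row], idx[coo.col])), shape=(n, n))
    return out.tocsr()


A = csr_matrix(np.asarray([[1, 2, 1], [2, 4, 5], [2, 1, 6]]))
indices = np.asarray([-1, -1, 2, -1, 4, -1, -1, 7, -1])
B = expand_sparse(A, indices)
```

The work here is proportional to the number of stored entries rather than to the full matrix size, so it should be cheap for ~20000 nonzeros. The final tocsr() call is only there because the question mentions CSR; drop it if COO is fine.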