For creating a scipy sparse matrix, I have an array or row and column indices I
and J
along with a data array V
. I use those to construct a matrix in COO format and then convert it to CSR,
matrix = sparse.coo_matrix((V, (I, J)), shape=(n, n))
matrix = matrix.tocsr()
I have a set of row indices for which the only entry should be a 1.0
on the diagonal. So far, I go through I
, find all indices that need wiping, and do just that:
def find(lst, a):
# From <http://stackoverflow.com/a/16685428/353337>
return [i for i, x in enumerate(lst) if x in a]
# wipe_rows = [1, 55, 32, ...] # something something
indices = find(I, wipe_rows) # takes too long
I = numpy.delete(I, indices).tolist()
J = numpy.delete(J, indices).tolist()
V = numpy.delete(V, indices).tolist()
# Add entry 1.0 to the diagonal for each wipe row
I.extend(wipe_rows)
J.extend(wipe_rows)
V.extend(numpy.ones(len(wipe_rows)))
# construct matrix via coo
That works alright, but find
tends to take a while.
Any hints on how to speed this up? (Perhaps wiping the rows in COO or CSR format is a better idea.)
If you intend to clear multiple rows at once, this
def _wipe_rows_csr(matrix, rows):
assert isinstance(matrix, sparse.csr_matrix)
# delete rows
for i in rows:
matrix.data[matrix.indptr[i]:matrix.indptr[i+1]] = 0.0
# Set the diagonal
d = matrix.diagonal()
d[rows] = 1.0
matrix.setdiag(d)
return
is by far the fastest method. It doesn't really remove the lines, but sets all entries to zeros, then fiddles with the diagonal.
If the entries are actually to be removed, one has to do some array manipulation. This can be quite costly, but if speed is no issue: This
def _wipe_row_csr(A, i):
'''Wipes a row of a matrix in CSR format and puts 1.0 on the diagonal.
'''
assert isinstance(A, sparse.csr_matrix)
n = A.indptr[i+1] - A.indptr[i]
assert n > 0
A.data[A.indptr[i]+1:-n+1] = A.data[A.indptr[i+1]:]
A.data[A.indptr[i]] = 1.0
A.data = A.data[:-n+1]
A.indices[A.indptr[i]+1:-n+1] = A.indices[A.indptr[i+1]:]
A.indices[A.indptr[i]] = i
A.indices = A.indices[:-n+1]
A.indptr[i+1:] -= n-1
return
replaces a given row i
of the matrix matrix
by the entry 1.0
on the diagonal.