I have a large (90k x 90k) numpy ndarray and I need to zero out a block of it. I have a list of about 30k indices that indicate which rows and columns need to be zeroed. The indices aren't necessarily contiguous, so a[min:max, min:max]-style slicing isn't possible.
As a toy example, I can start with a 2D array of non-zero values, but I can't seem to write zeros the way I expect.
import numpy as np
a = np.ones((6, 8))
indices = [2, 3, 5]
# I thought this would work, but it does not.
# It correctly writes to (2, 2), (3, 3), and (5, 5), but not to the other
# combinations: (2, 3), (2, 5), (3, 2), (3, 5), (5, 2), or (5, 3).
a[indices, indices] = 0.0
print(a)
[[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 0. 1. 1. 1. 1. 1.]
[1. 1. 1. 0. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 0. 1. 1.]]
# I thought this would fix that problem, but it doesn't change the array.
a[indices, :][:, indices] = 0.0
print(a)
[[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]]
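As I understand it, this second attempt is a no-op because fancy indexing with a list returns a copy rather than a view, so the assignment writes into a temporary array that is immediately thrown away. A quick check with np.shares_memory seems to confirm that:
import numpy as np
a = np.ones((6, 8))
indices = [2, 3, 5]
# Fancy (integer-list) indexing returns a copy, not a view, so writing into
# the result of a[indices, :] never touches the original array.
block = a[indices, :]
print(np.shares_memory(a, block))  # False: block is an independent copy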
In this toy example, I'm hoping for this result:
[[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 0. 0. 1. 0. 1. 1.]
[1. 1. 0. 0. 1. 0. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 0. 0. 1. 0. 1. 1.]]
I could probably write a cumbersome loop or build some combinatorially huge list of indices to do this, but it seems intuitive that this must be supported in a cleaner way; I just can't find the syntax to make it happen. Any ideas?
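For reference, the brute-force "huge list of indices" version I'd like to avoid would look something like the sketch below; with about 30k indices it would have to build roughly 900 million (row, col) pairs.
from itertools import product
import numpy as np
a = np.ones((6, 8))
indices = [2, 3, 5]
# Enumerate every (row, col) combination explicitly -- len(indices)**2 pairs --
# and pass the rows and columns as separate index lists.
rows, cols = zip(*product(indices, indices))
a[list(rows), list(cols)] = 0.0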
Based on hpaulj's comment, I came up with this, which works perfectly on the toy example.
a[np.ix_(indices, indices)] = 0.0
print(a)
[[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 0. 0. 1. 0. 1. 1.]
[1. 1. 0. 0. 1. 0. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 0. 0. 1. 0. 1. 1.]]
It also worked beautifully on the real data. It was faster than I expected and didn't noticeably increase memory consumption. Exhausting memory has been a constant concern with these giant arrays.
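If it helps anyone else: my understanding is that np.ix_ builds an "open mesh" of index arrays with shapes (len(indices), 1) and (1, len(indices)), and those two small arrays broadcast against each other during the assignment. Every (row, column) combination gets written without ever materializing the full list of pairs, which is presumably why memory use stayed flat. The same effect can be had with explicit broadcasting, something like:
import numpy as np
a = np.ones((6, 8))
idx = np.asarray([2, 3, 5])
# idx[:, None] has shape (3, 1) and idx has shape (3,); broadcasting the two
# index arrays together selects every (row, col) combination, like np.ix_ does.
a[idx[:, None], idx] = 0.0
print(a)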