I have two arrays, one of floats and one of integers:

arr1 = np.asarray([[1.5, 0.75, 0.2],
                   [0.3, 1.8, 4.2]])
arr2 = np.asarray([2, 1])
I need to amend arr1 such that

arr1[0, arr2[0]:] = 0
arr1[1, arr2[1]:] = 0

obtaining

array([[1.5 , 0.75, 0.  ],
       [0.3 , 0.  , 0.  ]])

i.e. the nth element of arr2 dictates the number of non-zero elements in row n of arr1.
How can I vectorise this for large arrays, and extend it to further dimensions? Say

rng = np.random.default_rng()
arr1 = rng.uniform(size=(100, 10, 1000))
arr2 = rng.integers(10, size=(100, 1000))

how can I change arr1 (or create a new arr3, depending on implementation) such that every pair of slices arr1[:, :, i], arr2[:, i] follows the above? The final implementation will be on the order of 100,000 x 100 x 10,000.
I think the trick would be to create a mask by comparing arr2 against an np.arange of column indices. In your toy example:

>>> arr2[:, None] > np.arange(3)
array([[ True,  True, False],
       [ True, False, False]])

This is a mask of which values to keep and which to replace with zeroes. You can then build a new array with arr3 = np.where(mask, arr1, 0), or directly modify arr1 in place with arr1[~mask] = 0.
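Putting the toy example together end-to-end, the in-place variant might look like this (using arr1.shape[1] instead of a hard-coded 3, so it generalises to any row length):

```python
import numpy as np

arr1 = np.asarray([[1.5, 0.75, 0.2],
                   [0.3, 1.8, 4.2]])
arr2 = np.asarray([2, 1])

# Compare arr2 (as a column vector) against the column indices 0..2:
# row n is True for its first arr2[n] positions.
mask = arr2[:, None] > np.arange(arr1.shape[1])

# Zero out everything the mask excludes, in place.
arr1[~mask] = 0
```

After this, arr1 is [[1.5, 0.75, 0.], [0.3, 0., 0.]], matching the desired output above.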
In your final example, I'm not entirely sure along which dimension you want to replace slices by zeroes, or what the final order means. So here is the trick in general:
# Let's say we want to replace slices along the 'D' axis.
arr1 = rng.uniform(size=(A, B, C, D, E, F))
# The slice indices only make sense if we choose integers that are smaller than D
arr2 = rng.integers(D, size=(A, B, C, E, F))
# Need to insert an empty axis in `arr2` with respect to the dimension in `arr1`
# along which you want to replace the slices.
# We do the opposite for the `np.arange` indices.
# That's because the broadcasting plays out nicely.
mask = arr2[:, :, :, None, :, :] > np.arange(D)[None, None, None, :, None, None]
arr3 = np.where(mask, arr1, 0)
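Assuming your 3-D example means the slices run along axis 1 (so arr2 indexes axes 0 and 2), the same pattern specialises to a single None in arr2 and a matching arange along axis 1:

```python
import numpy as np

rng = np.random.default_rng()
arr1 = rng.uniform(size=(100, 10, 1000))
arr2 = rng.integers(10, size=(100, 1000))

# arr2[:, None, :] has shape (100, 1, 1000); the arange has shape (1, 10, 1).
# Broadcasting them yields a (100, 10, 1000) mask: for each (i, k),
# the first arr2[i, k] entries along axis 1 are True.
mask = arr2[:, None, :] > np.arange(arr1.shape[1])[None, :, None]
arr3 = np.where(mask, arr1, 0)
```

Every slice arr3[:, :, i] then relates to arr2[:, i] exactly as in the 2-D case.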