I have two numpy arrays, arr1 and arr2. arr2 contains index values for arr1. The shape of arr1 is (100, 8, 96, 192) and the shape of arr2 is (8, 96, 192). What I would like to do is set all of the values in arr1 that come after the index values in arr2 to np.nan.
For context, arr1 is time x model x lat x lon, and each index value in arr2 corresponds to a point in time along the first axis of arr1. I would like to set the arr1 values after that point in time to np.nan.
Sample Data
arr1 = np.random.rand(*(100, 8, 96, 192))
arr2 = np.random.randint(low=0, high=80,size=(8, 96, 192))
in: print(arr1)
out: array([[[[0.61718651, 0.24426295, 0.9165573 , ..., 0.24155022,
0.22327592, 0.9533857 ],
[0.21922781, 0.87948651, 0.926359 , ..., 0.64281931,
...,
[0.09342961, 0.29533331, 0.11398662, ..., 0.36239606,
0.40228814, 0.87284515]]]])
in: print(arr2)
out: array([[[22, 5, 64, ..., 0, 37, 6],
[71, 48, 33, ..., 8, 38, 32],
[15, 41, 61, ..., 56, 32, 48],
...,
...,
[66, 31, 32, ..., 0, 10, 6],
[ 9, 28, 72, ..., 71, 29, 34],
[65, 22, 50, ..., 58, 49, 35]]])
For reference, I previously asked a question that had some similarities: Numpy multi-dimensional index
Based upon this, I tried
arr1 = np.random.rand(100, 8, 96, 192)
arr2 = np.random.randint(low=0, high=80, size=(8, 96, 192))
I, J, K = np.indices((8, 96, 192), sparse=True)
out = arr1[arr2:, I, J, K]
TypeError: only integer scalar arrays can be converted to a scalar index
This is also perhaps similar in concept to the following question, but for much higher-dimensional arrays: Set values in numpy array to NaN by index
In this case, I would recommend indexing with a boolean mask that has the same shape as arr1. Integer array advanced indexing like in your previous question is a lot harder here because, for each model x lat x lon, a variable number of elements needs to be indexed. Example:
import numpy as np
arr1 = np.random.rand(*(100, 8, 96, 192))
arr2 = np.random.randint(low=0, high=80,size=(8, 96, 192))
# These are the possible indices along the first axis in arr1
# Hence shape (100, 1, 1, 1):
idx = np.arange(100)[:, None, None, None]
# idx > arr2 broadcasts to shape (100, 8, 96, 192); True strictly after each index in arr2
arr1[idx > arr2] = np.nan
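To convince yourself that the broadcast comparison does what you want, here is a minimal sketch on a much smaller array. The shapes (5, 2, 1, 3) and the names small, cut, t, mask and masked are made up for illustration and are not from the question. It also uses np.where instead of in-place assignment, which is just one possible variation that leaves the input untouched:

```python
import numpy as np

# Hypothetical small shapes (time=5, model=2, lat=1, lon=3) so the result is easy to inspect
rng = np.random.default_rng(0)
small = rng.random((5, 2, 1, 3))
cut = rng.integers(low=0, high=4, size=(2, 1, 3))

# Time indices shaped (5, 1, 1, 1) broadcast against cut's shape (2, 1, 3)
t = np.arange(small.shape[0])[:, None, None, None]
mask = t > cut  # True strictly after the cut-off index

# np.where returns a new array and leaves `small` unchanged
masked = np.where(mask, np.nan, small)

# Sanity check: each (model, lat, lon) column has exactly 5 - 1 - cut NaNs
nan_counts = np.isnan(masked).sum(axis=0)
assert np.array_equal(nan_counts, small.shape[0] - 1 - cut)
```

If the value at the index itself should also become NaN, I assume you would switch the comparison to t >= cut.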