I have a numpy array like:
array([[ 1, 17, 33, ..., 28, 9, 22],
[ 3, 11, 1, ..., 25, 45, 14],
[ 3, 11, 1, ..., 21, 23, 5],
...,
[20, 6, 27, ..., 43, 15, 14],
[27, 6, 39, ..., 37, 17, 2],
[ 3, 11, 8, ..., 27, 35, 32]], dtype=int32)
From this array, I would like to filter out the rows in which the value 4 occurs at index 10 or earlier.
e.g. [1, 2, 3, 4, ..., 34, 35] - filter out, as the value 4 occurs at index 3, which is before index 10
e.g. [35, 34, 33, 32, ..., 4, 3, 2, 1] - keep, as the value 4 occurs after index 10.
What would be the way to achieve this filtering using NumPy masking?
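You can build a boolean mask over the first T+1 columns and keep only the rows it does not flag:
import numpy as np

arr = np.array(
[[1, 2, 3, 4, 34, 35, 2, 1],
[1, 2, 3, 5, 6, 7, 8, 9],
[1, 2, 4, 5, 6, 7, 8, 9],
[4, 2, 3, 5, 6, 7, 8, 9],
[35, 34, 33, 32, 4, 3, 2, 1]]
)
V, T = 4, 3 # value to look for, last index to check; use T = 10 for the array in the question
# True for every row where V occurs at any index from 0 to T (inclusive)
m = np.any(arr[:, :T+1] == V, axis=1)
# keep only the rows that were not flagged
out = arr[~m]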
Output:
>>> out
array([[ 1, 2, 3, 5, 6, 7, 8, 9],
[35, 34, 33, 32, 4, 3, 2, 1]])
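If it helps, the same masking idea can be wrapped in a small helper and tried with the threshold of 10 from the question. This is only a sketch: the function name `drop_rows_with_early_value` and the 13-column sample array `wide` are made up for illustration and are not part of the original data.

```python
import numpy as np


def drop_rows_with_early_value(arr, value, max_index):
    """Keep the rows of `arr` in which `value` does not occur at a column index <= max_index.

    Hypothetical helper name; it only wraps the mask shown above.
    """
    # Compare the first max_index + 1 columns against the value
    early = arr[:, :max_index + 1] == value
    # A row is dropped if any of those columns matched
    return arr[~early.any(axis=1)]


# Illustrative data (an assumption, not the asker's array)
wide = np.array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],   # 4 at index 3  -> dropped
    [13, 12, 11, 10, 9, 8, 7, 6, 5, 3, 2, 4, 1],   # 4 at index 11 -> kept
    [9, 8, 7, 6, 5, 3, 2, 1, 10, 11, 4, 12, 13],   # 4 at index 10 -> dropped
])

kept = drop_rows_with_early_value(wide, value=4, max_index=10)
print(kept)
```

Only the second row should survive here, since it is the only one where 4 first appears after index 10.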
Intermediates:
>>> arr[:, :T+1]
array([[ 1, 2, 3, 4],
[ 1, 2, 3, 5],
[ 1, 2, 4, 5],
[ 4, 2, 3, 5],
[35, 34, 33, 32]])
>>> arr[:, :T+1] == V
array([[False, False, False, True],
[False, False, False, False],
[False, False, True, False],
[ True, False, False, False],
[False, False, False, False]])
>>> np.any(arr[:, :T+1] == V, axis=1)
array([ True, False, True, True, False])
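A note on the `T+1` in the slice: Python slices exclude the end index, so `arr[:, :T]` would stop one column short and miss a match sitting exactly at index T. The sketch below compares the two masks on the same sample array; the names `inclusive` and `exclusive` are just illustrative.

```python
import numpy as np

arr = np.array(
    [[1, 2, 3, 4, 34, 35, 2, 1],
     [1, 2, 3, 5, 6, 7, 8, 9],
     [1, 2, 4, 5, 6, 7, 8, 9],
     [4, 2, 3, 5, 6, 7, 8, 9],
     [35, 34, 33, 32, 4, 3, 2, 1]]
)
V, T = 4, 3

# Columns 0..T, so a match at index T itself is caught ("at index T or before")
inclusive = np.any(arr[:, :T + 1] == V, axis=1)

# Columns 0..T-1 only, so the first row (4 at index 3) is no longer flagged
exclusive = np.any(arr[:, :T] == V, axis=1)

print(inclusive)
print(exclusive)
```

The two masks differ only for the first row, where 4 sits exactly at index 3. Use `:T+1` if "at index T or before" is what you want, and `:T` if the requirement is strictly before T.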