I have np.array and mask which are of the same shape. Once I apply the mask, the array loses it shape and becomes 1D - flattened one dimensional.
I am wanting to reduce my array across some axis, based on a mask of axis length 1D.
How can I apply a mask, but keep dimensionality of the array?
A small example in code:
# data ...
>>> data = np.ones((4, 4))
>>> data.shape
(4, 4)
# mask ...
>>> mask = np.ones((4, 4), dtype=bool)
>>> mask.shape
(4, 4)
# apply mask ...
>>> data[mask].shape
(16,)
My ideal shape would be (4, 4)
.
An example with array dimension reduction across an axis:
# data, mask ...
>>> data = np.ones((4, 4))
>>> mask = np.ones((4, 4), dtype=bool)
# remove last column from data ...
>>> mask[:, 3] = False
>>> mask
array([[ True, True, True, False],
[ True, True, True, False],
[ True, True, True, False],
[ True, True, True, False]])
# equivalent mask in 1D ...
>>> mask[0]
array([ True, True, True, False])
# apply mask ...
>>> data[mask].shape
(12,)
The ideal dimensions of the array would be (4, 3)
without reshape.
Help is appreciated, thanks!
The 'correct' way of achieving your goal is to not expand the mask to 2D. Instead index with [:, mask]
with the 1D mask. This indicates to numpy that you want axis 0 unchanged and mask
applied along axis 1.
a = np.arange(12).reshape(3, 4)
b = np.array((1,0,1,0),'?')
a
# array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11]])
b
# array([ True, False, True, False])
a[:, b]
# array([[ 0, 2],
# [ 4, 6],
# [ 8, 10]])
If your mask
is already 2D, numpy won't check whether all its rows are the same because that would be inefficient. But obviously you can use [:, mask[0]]
in that case.
If your mask
is 2D and just happens to have the same number of True
s in each row then either use @tel's answer. Or create an index array:
B = b^b[:3, None]
B
# array([[False, True, False, True],
# [ True, False, True, False],
# [False, True, False, True]])
J = np.where(B)[1].reshape(len(B), -1)
And now either
np.take_along_axis(a, J, 1)
# array([[ 1, 3],
# [ 4, 6],
# [ 9, 11]])
or
I = np.arange(len(J))[:, None]
IJ = I, J
a[IJ]
# #array([[ 1, 3],
# [ 4, 6],
# [ 9, 11]])