Given a (3,2,2) array, how do I get the second-dimension elements given a single value on the third dimension?
import numpy as np
arr = np.array([
    [[31., 1.], [41., 1.]],
    [[63., 1.], [73., 3.]],
    [[95., 1.], [100., 1.]],
])
ref = arr[(arr[:,:,0] > 41.) & (arr[:,:,0] <= 63)]
print(ref)
Result
[[63. 1.]]
Expected result
[[63., 1.],[73., 3.]]
The input value is 63, so I don't know in advance that 73 exists, but I want to return it as well. In other words, if the value exists, return the whole parent array without reshaping.
Another example
ref = arr[(arr[:,:,0] <= 63)]
Returns
[[31. 1.]
[41. 1.]
[63. 1.]]
But should return
[[[31. 1.]
[41. 1.]]
[[63. 1.]
[73. 3.]]]
I think you want
arr[((arr[:,:,0]>41)&(arr[:,:,0]<=63)).any(axis=1)]
and
arr[(arr[:,:,0] <= 63).any(axis=1)]
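To make the two expressions above easy to try, here is a minimal runnable sketch using the array from the question. The variable names (first, in_range, up_to_63 and the two res_ names) are only illustrative and do not appear in the original post.

```python
import numpy as np

arr = np.array([
    [[31., 1.], [41., 1.]],
    [[63., 1.], [73., 3.]],
    [[95., 1.], [100., 1.]],
])

first = arr[:, :, 0]  # shape (3, 2): the first value of every pair

# Reduce the 2D mask along axis 1 so there is one boolean per (2, 2) block:
# a block is kept as soon as one of its pairs matches.
in_range = ((first > 41) & (first <= 63)).any(axis=1)
up_to_63 = (first <= 63).any(axis=1)

res_in_range = arr[in_range]   # whole blocks, shape (1, 2, 2)
res_up_to_63 = arr[up_to_63]   # whole blocks, shape (2, 2, 2)

print(res_in_range)
print(res_up_to_63)
```

Both results stay 3D, so the "parent" structure is preserved without any reshaping.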
Some explanation. First of all, a 3D array is also a 2D array of 1D arrays, or a 1D array of 2D arrays.
So, if the expected answer is an array of "whole parents", that is, an array of 2D arrays (so a 3D array containing only some of the subarrays), you need a 1D array of booleans as the index. For example, [True, False, False] selects only the 1st row, [[[31., 1.], [41., 1.]]], which is much the same as arr[[0]].
arr[:,:,0]>41 is a 2D array of booleans, and would therefore pick individual pairs (that is, select elements along the first two axes).
For example, arr[[[True, False], [False, False], [False, False]]] selects only the 1st pair of the 1st subarray, [[31,1]], much as arr[[0],[0]] would.
So, since you want something like [[[31., 1.], [41., 1.]]], not something like [[31,1]], we need to produce a 1D array of booleans telling, for each row (each subarray along axis 0), whether we want it or not.
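The difference between a 1D and a 2D boolean index can be checked directly. This is a small sketch on the same array; the names row_mask, pair_mask, whole_rows and single_pairs are made up for the illustration.

```python
import numpy as np

arr = np.array([
    [[31., 1.], [41., 1.]],
    [[63., 1.], [73., 3.]],
    [[95., 1.], [100., 1.]],
])

# 1D mask: one boolean per subarray along axis 0, so whole (2, 2) blocks are kept.
row_mask = np.array([True, False, False])
whole_rows = arr[row_mask]

# 2D mask: one boolean per pair, so individual pairs are picked out
# and the first two axes collapse into one.
pair_mask = np.array([[True, False],
                      [False, False],
                      [False, False]])
single_pairs = arr[pair_mask]

print(whole_rows.shape)    # (1, 2, 2)
print(single_pairs.shape)  # (1, 2)
```

The 1D mask keeps the result 3D, while the 2D mask drops one level of nesting, which is exactly the flattening seen in the question.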
Now, the comments were about how to decide whether we want a subarray or not.
If we start from ref = arr[:,:,0] <= 63, that is, ref =
array([[ True,  True],
       [ True, False],
       [False, False]])
then arr[ref] would select 3 pairs, which is not what you want (again, we don't want a 2D array of booleans as selector, since we want whole parents).
Your answer is to use ref[:,0], which is a 1D array of 3 booleans (the first of each row): [True, True, False], which would select the first 2 rows.
This answer is to use ref.any(axis=1), which is also a 1D array of 3 booleans, each True iff at least one boolean of the row is True. So, also [True, True, False].
The difference between our two answers shows up with another example, used in the comments.
ref=arr[:,:,0]>97
array([[False, False],
[False, False],
[False, True]])
If we use ref[:,0], that is [False, False, False], then the answer is an empty array. Even the last row is not selected, though it contains a value over 97. But with ref[:,0] we are only interested in rows whose 1st pair has a 1st value > 97.
If we use ref.any(axis=1), as in this answer, that is [False, False, True], we get the last row, because this means that we are interested in any row in which at least one pair has a 1st value > 97.
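The contrast between the two selectors for the > 97 example can be reproduced with the sketch below. The names by_first_pair and by_any_pair are illustrative only.

```python
import numpy as np

arr = np.array([
    [[31., 1.], [41., 1.]],
    [[63., 1.], [73., 3.]],
    [[95., 1.], [100., 1.]],
])

ref = arr[:, :, 0] > 97

# Only looks at the first pair of each block: no block qualifies.
by_first_pair = arr[ref[:, 0]]

# Looks at every pair of each block: the last block qualifies because of 100.
by_any_pair = arr[ref.any(axis=1)]

print(by_first_pair.shape)  # (0, 2, 2)
print(by_any_pair)
```

Note that the empty result is still 3D, with shape (0, 2, 2), so code that consumes the result does not need a special case for "nothing found".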
We could also select rows in which only the second pair is tested for a 1st value > 97 (arr[ref[:,1]]). Or rows in which one pair, but not both, has a 1st value > 97 (arr[ref[:,0]^ref[:,1]]). Etc. Everything is possible. The point is, if we want to get a list of whole rows (subarrays along axis 0), then we need to build a 1D array of 3 booleans, deciding for each row whether we want all of it (True) or nothing (False).
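As a sketch of these variations, the block below builds several 1D selectors from the same 2D mask, using the <= 63 condition from earlier so that the selectors actually differ. The dictionary and its keys are only for illustration, and ref.all(axis=1) is an extra case that is not discussed above.

```python
import numpy as np

arr = np.array([
    [[31., 1.], [41., 1.]],
    [[63., 1.], [73., 3.]],
    [[95., 1.], [100., 1.]],
])

ref = arr[:, :, 0] <= 63  # 2D mask, one boolean per pair

# Each selector is a 1D array of 3 booleans, one per block along axis 0.
selectors = {
    "first pair matches": ref[:, 0],
    "second pair matches": ref[:, 1],
    "exactly one pair matches": ref[:, 0] ^ ref[:, 1],
    "any pair matches": ref.any(axis=1),
    "all pairs match": ref.all(axis=1),  # extra case, not from the answer
}

for label, mask in selectors.items():
    print(label, mask, arr[mask].shape)
```

Whatever rule is chosen, the result keeps whole blocks as long as the final index is 1D with one boolean per block.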