I have a dataframe like this:
   SeqNumber   X  Y  Z
0         12   4  5  5
1         12   7  5 -8
2         13  10  2  1
3         16   4  8  7
...
I would like to find the SeqNumbers that have a positive Z value in the rows where X lies between X_min and X_max and Y lies between Y_min and Y_max, and then keep every row of the whole dataframe that has one of those SeqNumbers. How can I do that using .loc?
For example, with x_min = 3, x_max = 8, y_min = 4 and y_max = 6, only the first 2 rows fall inside the range. Of those rows, only the first one has a positive Z. Finally, I would like to keep all the rows that share the SeqNumber of that first row (12). The result would be a dataframe with the first 2 rows of the original.
Define x_min, x_max, y_min, y_max (or compute them, for example with agg),
then select the rows that match your conditions:
x_min = 3
x_max = 8
y_min = 4
y_max = 6
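# SeqNumbers with at least one row inside the X/Y range and Z > 0
idx = df.loc[df['Z'].gt(0) & df['X'].between(x_min, x_max)
             & df['Y'].between(y_min, y_max),
             'SeqNumber'].values
# keep every row whose SeqNumber was selected above
out = df.loc[df['SeqNumber'].isin(idx)]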
Output:
>>> idx
array([12])
>>> out
   SeqNumber  X  Y  Z
0         12  4  5  5
1         12  7  5 -8
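
For reference, here is a self-contained sketch of the same idea that can be run as-is. It rebuilds the four sample rows shown in the question (the elided rows are not included), and it swaps the isin step for a groupby/transform("any") mask, which is an alternative I am assuming you may find convenient, not something the question requires. The names hit and keep are only illustrative. Note that between is inclusive on both ends by default.

```python
import pandas as pd

# Sample data: only the four rows visible in the question
df = pd.DataFrame({
    "SeqNumber": [12, 12, 13, 16],
    "X": [4, 7, 10, 4],
    "Y": [5, 5, 2, 8],
    "Z": [5, -8, 1, 7],
})

x_min, x_max = 3, 8
y_min, y_max = 4, 6

# True for rows inside the X/Y range that also have a positive Z
hit = (
    df["Z"].gt(0)
    & df["X"].between(x_min, x_max)
    & df["Y"].between(y_min, y_max)
)

# Spread the flag to every row of the same SeqNumber:
# a row is kept if any row sharing its SeqNumber is a hit
keep = hit.groupby(df["SeqNumber"]).transform("any")
out = df.loc[keep]
print(out)
```

Both variants should keep the same rows; the mask version just avoids materialising the intermediate array of SeqNumbers.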