Similar to "How do I select a subset of a DataFrame based on one level of a MultiIndex", let
import pandas as pd
df = pd.DataFrame({"v": [x*x for x in range(12)]},
                  index=pd.MultiIndex.from_product([["a","b","c"],[1,2,3,4]]))
and suppose I want to select only the rows where v is within 25 of its smallest value for the given first level:
      v
a 1   0
  2   1
  3   4
  4   9
b 1  16
  2  25
  3  36
c 1  64
  2  81
This time I have no idea how to do that easily.
You can group by level 0 of the DataFrame and get the minimum value of the v column in each group. Then subtract that minimum from each v and keep the rows where the difference is less than 25.
out = df[df['v'].sub(df.groupby(level=0)['v'].transform('min')) < 25]
print(out)
      v
a 1   0
  2   1
  3   4
  4   9
b 1  16
  2  25
  3  36
c 1  64
  2  81
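To see why the one-liner works, here is the same computation split into named steps. This is only a sketch of the approach above; the names group_min, diff and threshold are introduced for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"v": [x * x for x in range(12)]},
    index=pd.MultiIndex.from_product([["a", "b", "c"], [1, 2, 3, 4]]),
)

# Minimum of v per first-level group, broadcast back to one value per row
group_min = df.groupby(level=0)["v"].transform("min")

# Distance of each v from the minimum of its own group
diff = df["v"] - group_min

# Keep rows whose distance is strictly below the threshold (illustrative name)
threshold = 25
out = df[diff < threshold]
print(out)
```

Because transform returns a Series with the same index as df, the subtraction and the boolean mask line up row by row without any merge or reindexing.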
If you instead want to compare the level 1 values of the MultiIndex against their minimum within each level 0 group, you can do
out = df[((df.index.get_level_values(level=-1) -
           df.reset_index(level=-1).groupby(level=0)['level_1'].transform('min'))
          # or without reset index
          # df.groupby(level=0).transform(lambda g: g.index.get_level_values(level=-1).min())
          < 25).values]
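A possibly easier-to-read variant of the index-based filter, as a sketch only: it puts the level 1 values into a Series that shares df's index, so the group-wise minimum can be taken with the same transform pattern. The names level1, level1_min, out_all and out_small are illustrative, and the threshold of 2 in the second filter is an assumption chosen just to make the filter visible, since with 25 every row of this example passes. Note that 'level_1' in the code above is simply the default column name reset_index assigns to an unnamed second level.

```python
import pandas as pd

df = pd.DataFrame(
    {"v": [x * x for x in range(12)]},
    index=pd.MultiIndex.from_product([["a", "b", "c"], [1, 2, 3, 4]]),
)

# Level 1 values as a Series aligned with df's own MultiIndex
level1 = pd.Series(df.index.get_level_values(-1).to_numpy(), index=df.index)

# Smallest level 1 value within each level 0 group, one value per row
level1_min = level1.groupby(level=0).transform("min")

# With the threshold 25 from the question, every row qualifies here
out_all = df[(level1 - level1_min) < 25]

# Hypothetical smaller threshold, to show the filter removing rows
out_small = df[(level1 - level1_min) < 2]
print(out_small)
```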