I have this example dataset:
What I'm trying to do is see which categories in the id column have values that are strictly higher than 45, and also which ones don't. So it should tell me that ids 'a' and 'd' match my criterion, while 'b' and 'c' do not. Afterwards, I'll drop the rows for 'b' and 'c'.
What's the simplest way of doing that?
I tried
def filter_func(x):
    return x['vals']>45
df.groupby('id').filter(filter_func)
df['id'].unique()
but I get this error:
filter function returned a Series, but expected a scalar bool
You can try it this way. Taking the minimum of each group means an id passes only if all of its values are above 45:
df2 = df.groupby('id').min().reset_index()
df2.loc[df2['vals'] > 45, 'id']
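
If you also want the list of ids that fail and the filtered frame in one go, here is a minimal sketch. It assumes "match" means every value for that id is strictly above 45 (which is what the min() approach checks). The data below is made up for illustration, since the original dataset isn't shown, and the names all_above, keep_ids, drop_ids and filtered are just placeholders. The error in the question comes from filter() needing a single True/False per group, so the comparison is reduced with .all().

```python
import pandas as pd

# Hypothetical data: the original dataset is not shown, so these values are invented
df = pd.DataFrame({
    "id":   ["a", "a", "b", "b", "c", "c", "d", "d"],
    "vals": [50, 60, 40, 70, 10, 20, 46, 90],
})

# One boolean per id: True only when the smallest value in the group is above 45
all_above = df.groupby("id")["vals"].min() > 45

keep_ids = all_above[all_above].index.tolist()   # ids that match
drop_ids = all_above[~all_above].index.tolist()  # ids that do not

# filter() expects a scalar bool per group, so reduce the Series with .all()
filtered = df.groupby("id").filter(lambda g: (g["vals"] > 45).all())

print("keep:", keep_ids)
print("drop:", drop_ids)
print(filtered)
```

If you already have drop_ids, df[~df['id'].isin(drop_ids)] should give the same rows as filtered.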