I would like to create new columns that tag the nsmallest and nlargest values of a column values per group. How can I do that efficiently? for instance for the two lowest and highest periods, I would like to obtain:
Many thanks
data = {'group': ['A', 'A', 'A','A', 'B', 'B', 'B','B'],
'values': [10,5,5,5, 7, 7,7, 2]}
df :
group value tag 2 lowest/group tag 2 highest/group
1 A 10 0 1
2 A 5 1 1
3 A 5 1 0
4 A 5 0 0
5 B 7 1 1
6 B 7 0 1
7 B 7 0 0
8 B 2 1 0
I have found a solution, do not know if there is more efficient though:
df['tag 2 lowest/group'] = (df.groupby('group')['values'].transform(lambda x: x.nsmallest(2,keep="first")) > 0)*1