Search code examples
pandasgroup-by

tag nsmallest and nlargest values in a column in a groupby


I would like to create new columns that tag the nsmallest and nlargest values of a column values per group. How can I do that efficiently? for instance for the two lowest and highest periods, I would like to obtain:

Many thanks

data = {'group': ['A', 'A', 'A','A', 'B', 'B', 'B','B'],
        'values': [10,5,5,5, 7, 7,7, 2]}
 

    df :
          group  value  tag 2 lowest/group   tag 2 highest/group   
        1  A      10       0                      1
        2  A      5        1                      1
        3  A      5        1                      0
        4  A      5        0                      0
        5  B      7        1                      1
        6  B      7        0                      1
        7  B      7        0                      0
        8  B      2        1                      0 
  

Solution

  • I have found a solution, do not know if there is more efficient though:

    df['tag 2 lowest/group'] = (df.groupby('group')['values'].transform(lambda x: x.nsmallest(2,keep="first")) > 0)*1