python pandas group-by data-science data-munging

pandas: how to sort groups by their minimum value and get each group's index


I have a dataframe

df = 
Col Val
 a.  8
 a.  9
 c.  4
 c.  0
 d.  3
 d.  9

I want to sort the groups by the smallest Val within each group, and then, for each row, get the index of its Col group in that order. The new df should be:

df_new = 
Col Val Idx
 c.  4.  0
 c.  0.  0
 d.  3.  1
 d.  9.  1
 a.  8.  2
 a.  9.  2

What is the best way to do so?


Solution

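  • Use GroupBy.transform with 'min' to broadcast each group's minimum to a new column, then Series.rank with method='dense' to number those minimums, subtracting 1 so the index starts at 0. Finally, sort by both columns; if several groups share the same minimum they get the same Idx, and sorting by Col as well keeps each group's rows together: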

    df['Idx'] = (df.groupby('Col')['Val'].transform('min')
                   .rank(method='dense').astype(int).sub(1))
    df = df.sort_values(['Idx','Col'], ignore_index=True)
    print(df)
      Col  Val  Idx
    0  c.    4    0
    1  c.    0    0
    2  d.    3    1
    3  d.    9    1
    4  a.    8    2
    5  a.    9    2
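
For reference, here is a self-contained sketch of the approach above, rebuilt from the sample data in the question. The second variant is not part of the original answer: it assumes that groups sharing the same minimum should still receive distinct indices (ties broken by Col), and uses pd.factorize for that. The names group_min, dense, unique and the temporary column _min are illustrative only.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Col": ["a.", "a.", "c.", "c.", "d.", "d."],
    "Val": [8, 9, 4, 0, 3, 9],
})

# Broadcast each group's minimum back onto its rows.
group_min = df.groupby("Col")["Val"].transform("min")

# Variant 1 (as in the answer): dense rank of the minimum, shifted to start at 0.
# Groups that share the same minimum receive the same Idx.
dense = df.assign(Idx=group_min.rank(method="dense").astype(int).sub(1))
dense = dense.sort_values(["Idx", "Col"], ignore_index=True)

# Variant 2 (assumption: every group should get its own Idx, ties broken by Col).
# Sort by (group minimum, Col), then number groups in order of first appearance.
unique = (df.assign(_min=group_min)
            .sort_values(["_min", "Col"], ignore_index=True)
            .drop(columns="_min"))
unique["Idx"] = pd.factorize(unique["Col"])[0]

print(dense)
print(unique)
```

On this sample data both variants give the same frame, since no two groups share a minimum; they differ only when minimums tie. Note that ignore_index in sort_values requires pandas 1.0 or later; on older versions, .reset_index(drop=True) after sorting should have the same effect.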