I have a dataframe
df =
Col Val
a. 8
a. 9
c. 4
c. 0
d. 3
d. 9
I want to sort the groups by the smallest Val within each group, and then give every row the index of its Col group in that order. So the new df will be:
df_new =
Col Val Idx
c. 4 0
c. 0 0
d. 3 1
d. 9 1
a. 8 2
a. 9 2
What is the best way to do so?
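Use GroupBy.transform with 'min' to broadcast each group's minimal Val to every row of that group, then convert those minima to consecutive integers with Series.rank and method='dense'. The ranks start at 1, so subtract 1 to start at 0. Finally, sort by the 2 columns Idx and Col: if several groups share the same minimal value they get the same Idx, and sorting by Col keeps the rows of each group together: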
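# Dense rank of each group's minimal Val, shifted to start at 0
df['Idx'] = (df.groupby('Col')['Val'].transform('min')
               .rank(method='dense').astype(int).sub(1))
# Sort by the group index, then by Col so tied groups stay together
df = df.sort_values(['Idx','Col'], ignore_index=True)
print(df)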
Col Val Idx
0 c. 4 0
1 c. 0 0
2 d. 3 1
3 d. 9 1
4 a. 8 2
5 a. 9 2
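For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and splits the one-liner into its steps. The names group_min and result are only illustrative and are not part of the answer above; the Col values keep the trailing dots exactly as they appear in the question.

```python
import pandas as pd

# Sample data from the question (Col values copied as shown, dots included).
df = pd.DataFrame({
    "Col": ["a.", "a.", "c.", "c.", "d.", "d."],
    "Val": [8, 9, 4, 0, 3, 9],
})

# Step 1: broadcast each group's minimal Val to every row of that group.
group_min = df.groupby("Col")["Val"].transform("min")

# Step 2: dense rank maps the minima to 1, 2, 3, ...; subtract 1 to start at 0.
df["Idx"] = group_min.rank(method="dense").astype(int).sub(1)

# Step 3: sort by the group index, then by Col to keep tied groups together.
result = df.sort_values(["Idx", "Col"], ignore_index=True)
print(result)
```

Inspecting group_min on its own can help when checking why a group lands where it does: for this data it holds 8 for the a. rows, 0 for the c. rows and 3 for the d. rows.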