Suppose I have a dataframe:
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['A', 'C', 'F', 'B', 'D']
})
I am trying to get the alphabetically first value of B within each group of A.
When I tried this:
df['B'] = df.groupby('A')['B'].transform('first')
I got this result, where each group takes its first value in row order rather than in alphabetical order:
A B
0 foo A
1 foo A
2 bar F
3 bar F
4 bar F
What should I do to get a result like this?
A B
0 foo A
1 foo A
2 bar B
3 bar B
4 bar B
IIUC, sort your dataframe by B first, then groupby-transform. The transformed Series keeps the original row labels, so when you assign it back pandas aligns on the index and each value lands on its original row:
df['new_b'] = df.sort_values('B').groupby('A')['B'].transform('first')
Output:
A B new_b
0 foo A A
1 foo C A
2 bar F B
3 bar B B
4 bar D B
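For reference, here is a minimal self-contained sketch of the approach above, run against the question's data. The second variant, `transform('min')`, is not part of the answer above; it is an alternative that assumes B holds plain strings, where the minimum is the lexicographically smallest value. The names `via_sort` and `via_min` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['A', 'C', 'F', 'B', 'D'],
})

# Approach from the answer: sort by B so the alphabetically first value
# comes first within each group, then take 'first' per group.
via_sort = df.sort_values('B').groupby('A')['B'].transform('first')

# Alternative (assumption: B contains only strings): take the per-group
# minimum directly, with no sorting step.
via_min = df.groupby('A')['B'].transform('min')

# Assignment aligns on the index, so row order in df is unchanged.
df['new_b'] = via_sort
print(df)
```

Both variants should give the same column for this data. Note that string comparison is case-sensitive, so uppercase letters sort before lowercase ones in either variant.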