python pandas dataframe

Pandas: getting the alphabetically first value of each group


Suppose I have a dataframe:

import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['A', 'C', 'F', 'B', 'D']
})

I am trying to get the alphabetically first value of B within each group of A.

When I tried this:

df['B'] = df.groupby('A')['B'].transform('first')
 

I get this result, where each group takes its first value in row order rather than in alphabetical order:

    A   B
0   foo A
1   foo A
2   bar F
3   bar F
4   bar F

What should I do to get a result like this?

    A   B
0   foo A
1   foo A
2   bar B
3   bar B
4   bar B

Solution

  • IIUC, sort your dataframe by B first, then groupby-transform. The transformed result keeps the original index labels, so pandas aligns it back to the original row order when you assign it as a column:

    df['new_b'] = df.sort_values('B').groupby('A')['B'].transform('first')
    

    Output:

         A  B new_b
    0  foo  A     A
    1  foo  C     A
    2  bar  F     B
    3  bar  B     B
    4  bar  D     B
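
For reference, here is a minimal self-contained sketch of the approach above, checked against an alternative that is not part of the original answer: since the minimum of a set of strings is the alphabetically first one, transform('min') should give the same values without sorting. The names sorted_first and group_min are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['A', 'C', 'F', 'B', 'D'],
})

# Approach from the answer: sort by B, then take the first value per group.
# The result carries the original index labels (in sorted order).
sorted_first = df.sort_values('B').groupby('A')['B'].transform('first')

# Alternative (an assumption, not from the answer): the per-group minimum
# of strings is the alphabetically first value, so no sort is needed.
group_min = df.groupby('A')['B'].transform('min')

# Column assignment aligns on the index, restoring the original row order.
df['new_b'] = sorted_first

print(df)
print(sorted_first.sort_index().equals(group_min))
```

Both variants compare strings by code point, so uppercase letters sort before lowercase ones. If the column mixes cases, you would probably want to normalize it first.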