Search code examples
pythonpandasdataframenumpymin

Second last value (pandas, Python)


I have the following dataframe:

index A B
0 a 3
1 a 4
2 b 9
3 b 6
4 a 2
5 b 1

And I would like to get the second last value of each group of column "A". I already figured out how to get the min() value with a groupby :

df_grouped_last = df.groupby('A').agg({'B': ['min']})

But I need to get the second last value ("before last") so I can get :

index A 2nd last B
0 a 3
1 b 6

I will also need the third last and fourth in another work.

Any chance someone know how to code it ?

Thanks a lot ! Vincent


Solution

  • Looking at your expected output, the assumption is that column B is sorted for each group. If that is the case, use sort_values, combined with nth:

    (df.sort_values(['A', 'B'])
       .groupby('A', sort = False)
       .B
       .nth(-2) # familiar python construct ... 
                # takes second value from the bottom, per group
       .reset_index()
     )
    
       A  B
    0  a  3
    1  b  6