I have the following dataframe:
index | A | B |
---|---|---|
0 | a | 3 |
1 | a | 4 |
2 | b | 9 |
3 | b | 6 |
4 | a | 2 |
5 | b | 1 |
And I would like to get the second last value of each group of column "A". I already figured out how to get the min() value with a groupby :
df_grouped_last = df.groupby('A').agg({'B': ['min']})
But I need to get the second last value ("before last") so I can get :
index | A | 2nd last B |
---|---|---|
0 | a | 3 |
1 | b | 6 |
I will also need the third last and fourth in another work.
Any chance someone know how to code it ?
Thanks a lot ! Vincent
Looking at your expected output, the assumption is that column B
is sorted for each group. If that is the case, use sort_values, combined with nth:
(df.sort_values(['A', 'B'])
.groupby('A', sort = False)
.B
.nth(-2) # familiar python construct ...
# takes second value from the bottom, per group
.reset_index()
)
A B
0 a 3
1 b 6