I have a df with several duplicated values in column B.
What I need is to find the most recent date in column A for each value in column B and remove the rows that are not the most recent:
A B E
26/12/2023 apple 7,9
26/12/2022 apple 8,3
26/12/2023 pear 28,6
26/12/2022 orange 33,3
26/12/2023 wildberry 24,7
26/12/2022 wildberry 29,1
26/12/2023 grapes 17,1
The result should be:
A B E
26/12/2023 apple 7,9
26/12/2023 pear 28,6
26/12/2022 orange 33,3
26/12/2023 wildberry 24,7
26/12/2023 grapes 17,1
Could you help me find the correct approach? I am a beginner and got lost trying to use loc.
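You can use groupby to get the most recent date in column A for each value in column B. Column A should be a datetime column (for example via pd.to_datetime with dayfirst=True) so that max() compares dates rather than strings. Note that this returns one date per group, not the filtered rows: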
df.groupby('B')['A'].max()
https://scales.arabpsychology.com/stats/how-to-find-the-max-value-by-group-in-pandas/
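To get the full rows rather than only the dates, one option is to combine groupby with idxmax and loc. The following is a minimal sketch, not the only way to do it. It rebuilds the sample data from the question and assumes that column A holds day/month/year strings and that column E can stay as text with decimal commas:

```python
import pandas as pd

# Sample data copied from the question (E kept as text; that is an assumption)
df = pd.DataFrame({
    "A": ["26/12/2023", "26/12/2022", "26/12/2023", "26/12/2022",
          "26/12/2023", "26/12/2022", "26/12/2023"],
    "B": ["apple", "apple", "pear", "orange",
          "wildberry", "wildberry", "grapes"],
    "E": ["7,9", "8,3", "28,6", "33,3", "24,7", "29,1", "17,1"],
})

# Parse day/month/year strings so that dates are compared as dates, not as text
df["A"] = pd.to_datetime(df["A"], format="%d/%m/%Y")

# For each value of B, idxmax() returns the row label of the latest date in A
latest_idx = df.groupby("B")["A"].idxmax()

# Keep only those rows and restore the original row order
result = df.loc[latest_idx].sort_index()
print(result)
```

If two rows of the same group share the latest date, idxmax keeps the first one. An alternative with a similar effect is df.sort_values("A").drop_duplicates("B", keep="last"), which keeps the last row of each group after sorting by date.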