I have a dataset containing hourly temperatures for a year, so there are 24 entries for each day (one temperature per hour). I want to find the 5 days with the highest temperature. I know nlargest() returns the 5 largest values, but those values all fall on a single day. How do I find the 5 largest values such that each one is on a different day?
I tried using nlargest() and .loc but could not find a solution. Please help.
I have attached what the dataset looks like.
You can first reduce the data to one maximum per day with groupby.max, then take the top 5 of those daily maxima with nlargest:
df.groupby(['year','month','day'])['temp'].max().nlargest(5)
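Since the attached data isn't reproduced here, below is a minimal sketch on made-up data to show the difference between the two approaches. The column names (year, month, day, hour, temp) and the random temperatures are assumptions, so adjust them to match your frame. The last step, using idxmax to get back the full row (including the hour) of each day's peak, is an optional extra.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the dataset: one year of hourly readings.
# Column names and values are assumptions, not the asker's real data.
timestamps = pd.Timestamp("2021-01-01") + pd.to_timedelta(np.arange(365 * 24), unit="h")
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "year": timestamps.year,
    "month": timestamps.month,
    "day": timestamps.day,
    "hour": timestamps.hour,
    "temp": rng.normal(15, 8, size=len(timestamps)).round(1),
})

# Plain nlargest ranks individual hours, so several can share a day.
top_hours = df.nlargest(5, "temp")

# Reduce to one value per day first, then rank the days.
daily_max = df.groupby(["year", "month", "day"])["temp"].max()
top_days = daily_max.nlargest(5)
print(top_days)

# Optional: keep the whole row of each day's hottest hour,
# then rank those rows, so the hour column is preserved.
peak_rows = df.loc[df.groupby(["year", "month", "day"])["temp"].idxmax()]
top_rows = peak_rows.nlargest(5, "temp")
print(top_rows)
```

top_days is a Series indexed by (year, month, day), so each of the 5 entries is guaranteed to be a different day. If you have a single datetime column instead of separate year/month/day columns, grouping by df['timestamp'].dt.date (column name hypothetical) should work the same way.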