I want to get the trip_time with the highest count for each age_group
age_group | trip_time | counts |
---|---|---|
18 - 30yrs | 01am | 23 |
18 - 30yrs | 02am | 2 |
18 - 30yrs | 03am | 213 |
31 - 50yrs | 01am | 74 |
31 - 50yrs | 02am | 211 |
31 - 50yrs | 03am | 852 |
51 - 70yrs | 01am | 23 |
51 - 70yrs | 02am | 11 |
51 - 70yrs | 03am | 101 |
Expected output:
age_group | trip_time | counts |
---|---|---|
18 - 30yrs | 03am | 213 |
31 - 50yrs | 03am | 852 |
51 - 70yrs | 03am | 101 |
trip_time_age_group.groupby(['age_group', 'trip_time'])['counts'].max()
But it gives me wrong result
groupby 'age-group' and then use transform to get the max count for each age-group, which you then compare with the count of the DF, to get the resultset
df.loc[df.groupby('age_group')['counts'].transform('max').eq(df['counts'])]
age_group trip_time counts
2 18 - 30yrs 03am 213
5 31 - 50yrs 03am 852
8 51 - 70yrs 03am 101