I am trying to calculate the mode of a column after grouping my dataset. However, there is often a tie for the mode, and I get multiple results as a list (e.g. [2, 3, 4]).
Is it possible to only access the highest value in this list?
My output looks like this:
user  session  mode_score
98    1        5
      2        4
      3        5
      4        5
      5        [2, 3, 4, 5]
However, I would like it to look like this:
user  session  mode_score
98    1        5
      2        4
      3        5
      4        5
      5        5
I used this code to get to the mode:
df = pd.DataFrame(df[['user', 'session', 'audio_score']].groupby(['user','session'])['audio_score'].agg(pd.Series.mode))
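You can take the maximum of the modes within each group. Series.mode() returns every tied value, so calling .max() on its result keeps only the highest one: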
out = (df[['user', 'session', 'audio_score']]
       .groupby(['user', 'session'])['audio_score']
       .agg(lambda x: x.mode().max())
      )
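To check the behaviour on a small case, here is a minimal, self-contained sketch. The sample values are made up for illustration (they are not the data from the question), and renaming the result to mode_score via reset_index is just one way to get the column name shown in the desired output.

```python
import pandas as pd

# Hypothetical sample data: session 1 has a clear mode (5),
# session 5 has a four-way tie (2, 3, 4, 5 each appear once).
df = pd.DataFrame({
    "user": [98, 98, 98, 98, 98, 98, 98],
    "session": [1, 1, 1, 5, 5, 5, 5],
    "audio_score": [5, 5, 3, 2, 3, 4, 5],
})

# Series.mode() returns all tied modes; .max() keeps the highest one.
out = (
    df.groupby(["user", "session"])["audio_score"]
      .agg(lambda x: x.mode().max())
      .reset_index(name="mode_score")
)

print(out)
```

With this data, both sessions end up with a single scalar in mode_score instead of a list for the tied group.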