python-3.x pandas dataframe max frequency

Create new column with a list of max frequency values for each row of a pandas dataframe


Given this DataFrame:

import pandas as pd

df2 = pd.DataFrame([[3,3,3,3,3,3,5,5,5,5],[2,2,2,2,8,8,8,8,6,6]], columns=list('ABCDEFGHIJ'))

   A  B  C  D  E  F  G  H  I  J
0  3  3  3  3  3  3  5  5  5  5
1  2  2  2  2  8  8  8  8  6  6

I created two new columns that give, for each row, the most frequent value(s) (max_freq_val) and how many times they occur (max_freq):

df2["max_freq_val"] = df2.apply(lambda x: x.mode().agg(list), axis=1)
df2["max_freq"] = df2.loc[:, df2.columns != "max_freq_val"].apply(lambda x: x.value_counts().max(), axis=1)

   A  B  C  D  E  F  G  H  I  J max_freq_val  max_freq
0  3  3  3  3  3  3  5  5  5  5          [3]         6
1  2  2  2  2  8  8  8  8  6  6       [2, 8]         4
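The two apply calls above each scan every row separately. As a minimal sketch (not the asker's code), both columns can also be derived from a single value_counts() per row. The helper name row_mode_summary and the variable names pairs and result are hypothetical, chosen only for this illustration.

```python
import pandas as pd

df2 = pd.DataFrame(
    [[3, 3, 3, 3, 3, 3, 5, 5, 5, 5], [2, 2, 2, 2, 8, 8, 8, 8, 6, 6]],
    columns=list("ABCDEFGHIJ"),
)

def row_mode_summary(row):
    """Return (sorted list of most frequent values, their count) for one row.

    Hypothetical helper, written for this example only.
    """
    counts = row.value_counts()
    top = int(counts.max())
    # Keep every value whose count ties the maximum.
    values = sorted(counts[counts == top].index.tolist())
    return values, top

# One pass over the rows; each row is counted only once.
pairs = [row_mode_summary(row) for _, row in df2.iterrows()]

result = df2.assign(
    max_freq_val=pd.Series([values for values, _ in pairs], index=df2.index),
    max_freq=pd.Series([top for _, top in pairs], index=df2.index),
)
print(result)
```

This loops in Python, so it favors readability over speed; for large frames a vectorized approach is likely to be faster.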

EDIT: I've edited my code inspired by the answer given by @rhug123.

Thanks to all of you for your answers.


Solution

  • Try this; it uses mode(axis=1) to find the most frequent value(s) in each row:

    df2.assign(
        max_freq_val=df2.mode(axis=1).stack().groupby(level=0).agg(list),
        max_freq=df2.eq(df2.mode(axis=1)[0], axis=0).sum(axis=1),
    )
    

    or, computing the modes only once with an assignment expression (Python 3.8+):

    df2.assign(freq = df2.eq((s := df2.mode(axis=1).stack().groupby(level=0).agg(list)).str[0],axis=0).sum(axis=1),val = s)
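To make the one-liners above easier to follow, here is a step-by-step sketch of the same idea on the original ten columns. It is an illustration rather than the answerer's exact code: the intermediate names modes, vals, freq and out are made up for this example, and the cast back to int assumes the data is integer-valued (the NaN padding that mode(axis=1) adds for rows with fewer ties can turn the values into floats).

```python
import pandas as pd

df2 = pd.DataFrame(
    [[3, 3, 3, 3, 3, 3, 5, 5, 5, 5], [2, 2, 2, 2, 8, 8, 8, 8, 6, 6]],
    columns=list("ABCDEFGHIJ"),
)

# Row-wise modes: one column per tied mode, NaN where a row has fewer ties.
modes = df2.mode(axis=1)

# Collect each row's non-missing modes into a list.
# Assumption: the data is integer-valued, so casting back to int is safe.
vals = pd.Series(
    [row.dropna().astype(int).tolist() for _, row in modes.iterrows()],
    index=df2.index,
)

# Count how many cells in each row equal that row's first mode.
# All tied modes share the same count, so the first one is enough.
freq = df2.eq(modes[0], axis=0).sum(axis=1)

out = df2.assign(max_freq_val=vals, max_freq=freq)
print(out)
```

Run this against the original frame: if max_freq_val and max_freq have already been added to df2, mode(axis=1) would include those columns too.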