python pandas numpy frequency mode

How to get the most frequent row in a table


How to get the most frequent row in a DataFrame? For example, if I have the following table:

   col_1  col_2 col_3
0      1      1     A
1      1      0     A
2      0      1     A
3      1      1     A
4      1      0     B
5      1      0     C

Expected result:

   col_1  col_2 col_3
0      1      1     A

EDIT: I need the most frequent row (as one unit), not the most frequent value in each column, which is what the mode() method calculates.


Solution

  • Since pandas 1.1.0, the DataFrame.value_counts() method can be used to count the unique rows of a DataFrame:

    df.value_counts()
    

    Output:

    col_1  col_2  col_3
    1      1      A        2
           0      C        1
                  B        1
                  A        1
    0      1      A        1
    

    Because the result is sorted by count in descending order, its first entry is the most frequent row:

    df.value_counts().head(1).index.to_frame(index=False)
    

    Output:

       col_1  col_2 col_3
    0      1      1     A
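
For reference, here is a minimal self-contained sketch of the approach above. It rebuilds the sample table from the question and wraps the one-liner in a helper; the name most_frequent_row is a hypothetical one chosen for illustration, not part of pandas. It assumes pandas 1.1.0 or later.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "col_1": [1, 1, 0, 1, 1, 1],
        "col_2": [1, 0, 1, 1, 0, 0],
        "col_3": ["A", "A", "A", "A", "B", "C"],
    }
)


def most_frequent_row(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return the most frequent row as a one-row DataFrame."""
    # value_counts() treats each row as one unit and sorts by count, descending.
    counts = frame.value_counts()
    # The index of the first entry holds the row values; turn it back into a frame.
    return counts.head(1).index.to_frame(index=False)


result = most_frequent_row(df)
# Number of times the most frequent row occurs.
top_count = int(df.value_counts().iloc[0])

print(result)
print(top_count)
```

If several rows share the highest count, this returns only one of them, and which one should not be relied on.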