python pandas ranking

Finding the highest values in each row of a data frame in Python


I'd like to find the highest values in each row of a pandas data frame and return the column headers of those values. For example, I'd like to find the top two in each row:

df =  
       A    B    C    D  
       5    9    8    2  
       4    1    2    3  

I'd like my output to look like this:

df =        
       B    C  
       A    D

Solution

  • You can use a dictionary comprehension to collect the top_n column labels for each row of the dataframe. I transposed the dataframe so that each original row becomes a column, then applied nlargest to each of those columns and used .index.tolist() to extract the labels of the top_n values. Finally, I transposed the result to get the dataframe back into the desired shape, with one row per original row.

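    >>> import pandas as pd
    >>> df = pd.DataFrame({'A': [5, 4], 'B': [9, 1], 'C': [8, 2], 'D': [2, 3]})
    >>> top_n = 2
    >>> pd.DataFrame({n: df.T[col].nlargest(top_n).index.tolist()
    ...               for n, col in enumerate(df.T)}).T
       0  1
    0  B  C
    1  A  D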
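
    As an alternative to the transpose-and-nlargest approach above, here is a minimal sketch that sorts each row with NumPy instead. It is not part of the original answer: the helper name top_n_columns is hypothetical, and it assumes every column holds signed numeric values with no missing data.

```python
import numpy as np
import pandas as pd


def top_n_columns(df: pd.DataFrame, top_n: int) -> pd.DataFrame:
    """Return the column labels of the top_n largest values in each row.

    Hypothetical helper; assumes signed numeric data without NaN.
    """
    # Negate the values so an ascending sort puts the largest first.
    # A stable sort keeps the original column order when values tie.
    order = np.argsort(-df.to_numpy(), axis=1, kind="stable")[:, :top_n]
    # Index the column labels with the per-row column positions.
    labels = df.columns.to_numpy()[order]
    return pd.DataFrame(labels, index=df.index)


# Same data as in the question.
df = pd.DataFrame({"A": [5, 4], "B": [9, 1], "C": [8, 2], "D": [2, 3]})
result = top_n_columns(df, 2)
print(result)
```

    On the data from the question this should give B, C for the first row and A, D for the second, matching the output above. It also keeps the original row index, which the enumerate-based version replaces with 0, 1, ...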