python pandas highlight

Highlighting nlargest values of columns in a pandas df


I highlight the maximum value of each column of my DataFrame in yellow with the following code:

def highlight_max(s):
    is_max = s == s.max()
    return ['background-color: yellow' if v else '' for v in is_max]

pivot_p.style.apply(highlight_max)

Now I want to highlight the 5 largest values of each column instead. I tried the following code, but it raises an error:

def highlight_large(s):
    is_large = s == s.nlargest(5)
    return ['background-color: yellow' if v else '' for v in is_large]

pivot_p.style.apply(highlight_large)

Error:

ValueError: ('Can only compare identically-labeled Series objects', 'occurred at index %_0')

Solution

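  • s.nlargest(5) returns a Series with only 5 rows, so it cannot be compared element-wise with the full column s. Instead, take the 5 largest values as an array and test each value of the column against it: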

    def highlight_large(s):
        is_large = s.nlargest(5).values
        return ['background-color: yellow' if v in is_large else '' for v in s]
    

    Full example:

    # Import modules
    import pandas as pd
    import numpy as np
    
    # Create example dataframe
    pivot_p = pd.DataFrame({"a": np.random.randint(0,15,20),
                      "b": np.random.random(20)})
    
    def highlight_large(s):
        # Get the 5 largest values of the column
        is_large = s.nlargest(5).values
        # Apply the style if the current value is among the 5 largest values
        return ['background-color: yellow' if v in is_large else '' for v in s]
    
    pivot_p.style.apply(highlight_large)
    

    Output: the styled table, with the largest values of each column highlighted in yellow (screenshot not available).
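
One caveat with matching on values: if several rows are tied with the fifth-largest value, all of them get highlighted, so a column can end up with more than 5 yellow cells. If you need exactly n cells, a possible variant is to match on the index labels returned by nlargest instead. This is only a sketch: highlight_top_n and its n and css parameters are names made up for illustration, and it assumes the index labels are unique.

```python
import pandas as pd

def highlight_top_n(s, n=5, css="background-color: yellow"):
    # Hypothetical helper: build the mask from index labels, not values,
    # so exactly n cells are styled even when values are tied.
    # Assumes the index of s has no duplicate labels.
    top_labels = s.nlargest(n).index
    mask = s.index.isin(top_labels)
    return [css if flagged else "" for flagged in mask]

# Small example column with a tie at the cutoff (three 7s, only one fits in the top 3)
col = pd.Series([3, 9, 9, 1, 7, 7, 7, 2], name="a")
styles = highlight_top_n(col, n=3)
```

Extra keyword arguments given to Styler.apply are forwarded to the function, so it should be usable as pivot_p.style.apply(highlight_top_n, n=5).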