Search code examples
pythonpandaspandas-styles

how to highlight and format a given column?


Consider the simple example below

df = pd.DataFrame({'text' : ['hello world'],
                   'var1' : [0.1],
                   'var2' : [0.2]})

#this highlights the highest number in red
def highlight_max(s, props = ''):
    return np.where(s == np.nanmax(s.values), props, '')

#this turns the numbers into percentages
df.style.format({'var1' : '{:,.2%}'.format,
                 'var2' : '{:,.2%}'.format})

I am trying to automatically highlight the highest number (by rows) in red AND format any number into percentage. I have the two pieces separately but I do not know how to do this in just one .style call. Any ideas?

Thanks!


Solution

  • highlight_max is a builtin function, we can chain calls to produce a styled table, or store the styler as a variable and reuse it to apply multiple styles:

    numeric_cols = ['var1', 'var2']
    (
        df.style.highlight_max(color='red', axis=1, subset=numeric_cols)
            .format('{:,.2%}'.format, subset=numeric_cols)
    )
    

    Or

    numeric_cols = ['var1', 'var2']
    styler = df.style
    styler.highlight_max(color='red', axis=1, subset=numeric_cols)
    styler.format('{:,.2%}'.format, subset=numeric_cols)
    

    The user-defined function can be used as well by using Styler.apply on axis=1 and defining props to the desired CSS styles:

    #this highlights the highest number in red
    def highlight_max(s, props = ''):
        return np.where(s == np.nanmax(s.values), props, '')
    
    numeric_cols = ['var1', 'var2']
    #this turns the numbers into percentages
    (
        df.style
            .apply(highlight_max, props='background-color: red',
                   subset=numeric_cols, axis=1)
            .format('{:,.2%}'.format, subset=numeric_cols)
    )
    

    All options produce styled DataFrame:

    styled dataframe


    If needed the numeric_cols list of columns can be generated dynamically with something like:

    numeric_cols = df.select_dtypes(include='number').columns
    

    Or:

    numeric_cols = df.columns[(df.dtypes == 'int64') | (df.dtypes == 'float64')]