Search code examples
pythonpandaspandas-groupbymulti-indexpandas-styles

Highlighting values based on groupby in pandas


I have a multi-indexed dataframe in pandas in which I want to highlight values above the mean of each "Count" column with respect to each "Id1" subframe. My actual dataframe is much larger, but here is a simplified example:

Desired output

I'm almost able to get what I want already by iterating through the groupby and applying the style function to each subframe individually.

import pandas as pd


def highlight_max(x):
    return ['background-color: yellow' if v > (x.mean()) else '' for v in x]

iterables = [["Land", "Ocean"], ["Liquid", "Ice"]]

index = pd.MultiIndex.from_product(iterables, names=["Id1", "Id2"])

df = pd.DataFrame({'Count A': [12., 70., 30., 20.], 'Count B': [12., 70., 30., 20.]}, index=index)

for id, id_frame in df.groupby('Id1'):
    id_frame = id_frame.style.apply(highlight_max, axis=0)
    id_frame.to_excel(id+'.xlsx')

The issue is that I want the highlighting to apply to the entire dataframe, without splitting it apart. With my current code, the dataframe is being split up:

Frame 1

Frame 2

I thought about concatenating each subframe together, but they are Styler objects and as far as I'm aware that's not possible. Is there a better solution to this problem?


Solution

  • Instead of calling your function for each group, call it for the whole dataframe. Inside the function, use groupby(level=0).transform('mean') to get means for each group, and then compare with col > means:

    def s(col):
        means = col.groupby(level=0).transform('mean')
        return (col > means).map({
            True: 'background-color: yellow',
            False: '',
        })
    
    style = df.style.apply(s)
    style
    

    Output:

    enter image description here