I have a multi-indexed dataframe in pandas in which I want to highlight values above the mean of each "Count" column with respect to each "Id1" subframe. My actual dataframe is much larger, but here is a simplified example:
I'm almost able to get what I want already by iterating through the groupby and applying the style function to each subframe individually.
import pandas as pd
def highlight_max(x):
return ['background-color: yellow' if v > (x.mean()) else '' for v in x]
iterables = [["Land", "Ocean"], ["Liquid", "Ice"]]
index = pd.MultiIndex.from_product(iterables, names=["Id1", "Id2"])
df = pd.DataFrame({'Count A': [12., 70., 30., 20.], 'Count B': [12., 70., 30., 20.]}, index=index)
for id, id_frame in df.groupby('Id1'):
id_frame = id_frame.style.apply(highlight_max, axis=0)
id_frame.to_excel(id+'.xlsx')
The issue is that I want the highlighting to apply to the entire dataframe, without splitting it apart. With my current code, the dataframe is being split up:
I thought about concatenating each subframe together, but they are Styler objects and as far as I'm aware that's not possible. Is there a better solution to this problem?
Instead of calling your function for each group, call it for the whole dataframe. Inside the function, use groupby(level=0).transform('mean')
to get means for each group, and then compare with col > means
:
def s(col):
means = col.groupby(level=0).transform('mean')
return (col > means).map({
True: 'background-color: yellow',
False: '',
})
style = df.style.apply(s)
style
Output: