How does one apply a style to an arbitrary subset of a pandas dataframe? Specifically, I have a dataframe df that contains some NaNs, and I want to apply a background gradient to it everywhere except where there are NaNs (with the same colormap applied to all cells).
I know that background_gradient (and applymap more generally) has a subset parameter, but I do not understand from the documentation how to use it to select an arbitrary subset of the dataframe.
import numpy as np
import pandas as pd
df = pd.DataFrame(data={'A': [0, 1, np.nan], 'B': [.5, np.nan, 0], 'C': [np.nan, 1, 1]})
mask = ~pd.isnull(df)
Then if I try
df.style.background_gradient(subset=mask)
I get the error:
IndexingError: Too many indexers
I know how to apply a style to a subset of a dataframe in the specific case where that subset is a Cartesian product of indices and columns, using something like the solution in the question "How do I style a subset of a pandas dataframe?". So the question is what to do when the subset is not such a product, as in the example above.
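For reference, the Cartesian-product case mentioned above can be sketched as follows. As far as I understand the documentation, subset must be something that DataFrame.loc accepts, which is why a boolean dataframe mask is rejected. The names block and style_block are only illustrative, and the styling call itself assumes the optional jinja2 and matplotlib dependencies are installed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0, 1, np.nan], 'B': [.5, np.nan, 0], 'C': [np.nan, 1, 1]})

# Rows 0-1 crossed with columns A-B: a rectangular (Cartesian product) block.
block = pd.IndexSlice[[0, 1], ['A', 'B']]

# The same indexer works with .loc, which is what makes it a valid subset.
selected = df.loc[block]

def style_block(frame):
    # Illustrative helper: applies the gradient to the rectangular block only.
    # Requires the optional jinja2 and matplotlib dependencies.
    return frame.style.background_gradient(subset=block)
```

Calling style_block(df) in a notebook should color only the four cells in that block; the non-NaN cells of df do not form such a block, so no single indexer of this kind selects exactly them.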
One solution might be to loop through the columns and apply the style column-by-column (then each application is to a Cartesian product subset). In my case, I can pass low and high parameters to the background_gradient method to force the colormaps to match up between columns, but that fails when (as above) one or more of those columns contains only one distinct non-NaN value. This in turn could be bypassed by rewriting the background_gradient function, but that's clearly undesirable.
You can write a custom function for this:
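import matplotlib.pyplot as plt

# plt.get_cmap is used because matplotlib.cm.get_cmap was removed in Matplotlib 3.9
cmap = plt.get_cmap('PuBu')

# low and high set the value range that is mapped onto the colormap
def threshold(x, low=0, high=1, mid=0.5):
    # NaN cell: leave unstyled
    if np.isnan(x):
        return ''
    # non-NaN cell: rescale to [0, 1] and look up the color
    x = (x - low) / (high - low)
    r, g, b, _ = cmap(x, bytes=True)
    background = f'background-color: rgb({r}, {g}, {b})'
    # use white text on the darker end of the colormap
    text_color = 'color: white' if x > mid else ''
    return background + ';' + text_color

# apply the style cell by cell (Styler.applymap is deprecated in favor of Styler.map since pandas 2.1)
df.style.applymap(threshold, low=-1, high=1, mid=0.3)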
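If you would rather not hard-code low and high, one possible variant is to build the whole table of CSS strings at once, so that a single normalization is shared by every non-NaN cell. This is only a sketch: gradient_css and style_non_nan are names made up for this example, and taking the range from the data's own minimum and maximum is an assumption rather than anything the question asks for.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'A': [0, 1, np.nan], 'B': [.5, np.nan, 0], 'C': [np.nan, 1, 1]})

def gradient_css(frame, cmap_name='PuBu'):
    """Return a DataFrame of CSS strings; NaN cells get an empty string."""
    cmap = plt.get_cmap(cmap_name)
    values = frame.to_numpy(dtype=float)
    # One range for the whole frame, ignoring NaNs.
    low, high = np.nanmin(values), np.nanmax(values)
    span = (high - low) or 1.0  # avoid dividing by zero if all values are equal
    css = pd.DataFrame('', index=frame.index, columns=frame.columns)
    for (i, j), value in np.ndenumerate(values):
        if np.isnan(value):
            continue  # leave NaN cells unstyled
        r, g, b, _ = cmap((value - low) / span, bytes=True)
        css.iat[i, j] = f'background-color: rgb({r}, {g}, {b})'
    return css

css = gradient_css(df)

def style_non_nan(frame):
    # axis=None hands the whole frame to gradient_css in one call.
    # Requires the optional jinja2 dependency.
    return frame.style.apply(gradient_css, axis=None)
```

Here css has the same shape and labels as df, which is what Styler.apply expects from a function used with axis=None, so style_non_nan(df) should render the gradient on every cell except the NaNs.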
Output: