I try to apply styler to the pivot table based on the next condition: if the percentage is in the range from 0 to 100, then color it yellow, if more than 100, then color it red, but after that I take a correction for another value df['Value']
and if the value in the same row is less than 10, then everything in this row is repainted in gray in the percentage column.
The DataFrame I have:
df = pd.DataFrame({"Name": ["A", "A","A","A","A","A","A","A",
"B", "B","B","B","B","B","B","B",
"C", "C","C","C","C","C","C","C",
"D", "D","D","D","D","D","D","D",
],
"Year": [1993, 1993,1993,1993,1994,1994,1994,1994,
1993, 1993,1993,1993,1994,1994,1994,1994,
1993, 1993,1993,1993,1994,1994,1994,1994,
1993, 1993,1993,1993,1994,1994,1994,1994],
"Month": ["jun", "july", "august", "september","jun", "july", "august", "september",
"jun", "july", "august", "september","jun", "july", "august", "september",
"jun", "july", "august", "september","jun", "july", "august", "september",
"jun", "july", "august", "september","jun", "july", "august", "september"],
"Value": [10, 8, 3, 12, 10, 8, 3, 12,
10, 8, 3, 12, 10, 8, 3, 12,
10, 8, 3, 12, 10, 8, 3, 12,
10, 8, 3, 12, 10, 8, 3, 12,],
"Procent": [172, 54, 101, 89, 14, 35, 48, 79,
172, 54, 101, 89, 14, 35, 48, 79,
172, 54, 101, 89, 14, 35, 48, 79,
172, 54, 101, 89, 14, 35, 48, 79]})
df = pd.pivot_table(df,
index=["Name"],
values=["Value", "Procent"],
columns=["Year", "Month"])
I tried to create next function:
def color(row):
result = ["" for _ in df.loc[:, idx[['Procent', 'Value'], :]].columns]
if ((row['Procent'] > 0) & (row['Procent'] < 100)):
result[row["Procent"]] = "background-color: yellow"
if row['Procent'] >= 100:
result[row["Procent"]] = "background-color: red"
if row['Value'] < 10:
result[row["Procent"]] = "background-color: silver"
return result
df = df.style.\
apply(color, axis=1)
But this return mistake The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
.
How can I apply styler to achieve this result?
There are a few options here. The simplest in terms of lines of code is to use np.select with the conditions:
def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
styles_df['Procent'] = np.select([
# Condition 1
df_slice['Value'] < 10,
# Condition 2
(0 < df_slice['Procent']) & (df_slice['Procent'] < 100),
# Condition 3
df_slice['Procent'] >= 100,
], [
# Color for Condition 1
'background-color: silver',
# Color for Condition 2
'background-color: yellow',
# Color for Condition 3
'background-color: red'
])
return styles_df
df.style.apply(apply_colors, axis=None)
Notice, the silver condition has to come first. Only a single condition can be true for np.select, so whatever colour is to be displayed must be the first match. So we need to slightly reorder the conditions to have the correct results display.
Additionally note the above is not index safe as we're directly applying the Values
numbers to the corresponding 2d position in Procent
. This assumes the level 1 and 2 column indexes being exactly identical between the two top-level column indexes.
If this is not a safe assumptions we can first reindex_like to ensure that the indexes do align. This changes condition one to:
df_slice['Value'].reindex_like(df_slice['Procent']) < 10
All together an index safe version could look like:
def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
styles_df['Procent'] = np.select([
# Condition 1 (ensure Value is index aligned with Procent)
df_slice['Value'].reindex_like(df_slice['Procent']) < 10,
# Condition 2
(0 < df_slice['Procent']) & (df_slice['Procent'] < 100),
# Condition 3
df_slice['Procent'] >= 100,
], [
# Color for Condition 1
'background-color: silver',
# Color for Condition 2
'background-color: yellow',
# Color for Condition 3
'background-color: red'
])
return styles_df
df.style.apply(apply_colors, axis=None)
Alternatively, since pandas styler objects are not designed for large repr, we usually have some leeway to do things less efficiently than in data operations with pandas.
It may make sense, depending on the actual complexity of the conditions, to do multiple Styler.apply operations and perform the operation exactly as described.
Value
s are less than 10.def initial_color(df_slice: pd.DataFrame) -> np.ndarray:
styles = np.select([
# Condition 1
(0 < df_slice) & (df_slice < 100),
# Condition 2
df_slice >= 100
], [
# Color for Condition 1
'background-color: yellow',
# Color for Condition 2
'background-color: red'
])
return styles
def recolor_gray(df_slice: pd.DataFrame) -> pd.DataFrame:
# Create an empty DataFrame of styles
styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
# Make background silver where Value is less than 10
styles_df['Procent'] = np.where(
df_slice['Value'] < 10, 'background-color: silver', ''
)
# Return the styles
return styles_df
(
df.style
.apply(initial_color, subset='Procent', axis=None)
.apply(recolor_gray, axis=None)
)
Note that this method, specifically this line:
styles_df['Procent'] = np.where(
df_slice['Value'] < 10, 'background-color: silver', ''
)
makes the same assumption as above, that Value
and Procent
are identically indexed (as shown in the display).
However if this is not guaranteed to be the case, we can again reindex_like Procent
so the shapes are guaranteed to align properly:
def recolor_gray(df_slice: pd.DataFrame) -> pd.DataFrame:
# Create an empty DataFrame of styles
styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
# Make background silver where Value is less than 10
styles_df['Procent'] = np.where(
# Ensure indexes align correctly
df_slice['Value'].reindex_like(df_slice['Procent']) < 10,
'background-color: silver', ''
)
# Return the styles
return styles_df
This results in the same table; the same operations are simply performed in a different order:
Since pandas styler behaviour can very significantly between versions the above answer was written in pandas version 1.4.2 and numpy version 1.23.2.
Imports and aliases are as follows:
import numpy as np # v1.23.2
import pandas as pd # v1.4.3