Search code examples
pythonpandaspivot-tablepandas-styles

Apply styler to pivot table


I try to apply styler to the pivot table based on the next condition: if the percentage is in the range from 0 to 100, then color it yellow, if more than 100, then color it red, but after that I take a correction for another value df['Value'] and if the value in the same row is less than 10, then everything in this row is repainted in gray in the percentage column.

The DataFrame I have:

df = pd.DataFrame({"Name": ["A", "A","A","A","A","A","A","A",
                            "B", "B","B","B","B","B","B","B",
                            "C", "C","C","C","C","C","C","C",
                            "D", "D","D","D","D","D","D","D",
                            ],
                    "Year": [1993, 1993,1993,1993,1994,1994,1994,1994,
                                1993, 1993,1993,1993,1994,1994,1994,1994,
                               1993, 1993,1993,1993,1994,1994,1994,1994,
                               1993, 1993,1993,1993,1994,1994,1994,1994],
                    "Month": ["jun", "july", "august", "september","jun", "july", "august", "september",
                            "jun", "july", "august", "september","jun", "july", "august", "september",
                            "jun", "july", "august", "september","jun", "july", "august", "september",
                            "jun", "july", "august", "september","jun", "july", "august", "september"],
                    "Value": [10, 8, 3, 12, 10, 8, 3, 12,
                            10, 8, 3, 12, 10, 8, 3, 12,
                            10, 8, 3, 12, 10, 8, 3, 12,
                            10, 8, 3, 12, 10, 8, 3, 12,],
                    "Procent": [172, 54, 101, 89, 14, 35, 48, 79,
                            172, 54, 101, 89, 14, 35, 48, 79,
                            172, 54, 101, 89, 14, 35, 48, 79,
                            172, 54, 101, 89, 14, 35, 48, 79]})
df = pd.pivot_table(df,
                     index=["Name"],
                     values=["Value", "Procent"],
                     columns=["Year", "Month"])

I tried to create next function:

def color(row):
    result = ["" for _ in df.loc[:, idx[['Procent', 'Value'], :]].columns]
    if ((row['Procent'] > 0) & (row['Procent'] < 100)):
        result[row["Procent"]] = "background-color: yellow"
    if row['Procent'] >= 100:
        result[row["Procent"]] = "background-color: red"
    if row['Value'] < 10:
        result[row["Procent"]] = "background-color: silver"
    
    return result

df = df.style.\
    apply(color, axis=1)

But this return mistake The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I apply styler to achieve this result?

enter image description here


Solution

  • There are a few options here. The simplest in terms of lines of code is to use np.select with the conditions:

    def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
        styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
        styles_df['Procent'] = np.select([
            # Condition 1
            df_slice['Value'] < 10,
            # Condition 2
            (0 < df_slice['Procent']) & (df_slice['Procent'] < 100),
            # Condition 3
            df_slice['Procent'] >= 100,
        ], [
            # Color for Condition 1
            'background-color: silver',
            # Color for Condition 2
            'background-color: yellow',
            # Color for Condition 3
            'background-color: red'
        ])
        return styles_df
    
    
    df.style.apply(apply_colors, axis=None)
    

    Notice, the silver condition has to come first. Only a single condition can be true for np.select, so whatever colour is to be displayed must be the first match. So we need to slightly reorder the conditions to have the correct results display.

    Additionally note the above is not index safe as we're directly applying the Values numbers to the corresponding 2d position in Procent. This assumes the level 1 and 2 column indexes being exactly identical between the two top-level column indexes.

    If this is not a safe assumptions we can first reindex_like to ensure that the indexes do align. This changes condition one to:

    df_slice['Value'].reindex_like(df_slice['Procent']) < 10
    

    All together an index safe version could look like:

    def apply_colors(df_slice: pd.DataFrame) -> pd.DataFrame:
        styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
        styles_df['Procent'] = np.select([
            # Condition 1 (ensure Value is index aligned with Procent)
            df_slice['Value'].reindex_like(df_slice['Procent']) < 10,
            # Condition 2
            (0 < df_slice['Procent']) & (df_slice['Procent'] < 100),
            # Condition 3
            df_slice['Procent'] >= 100,
        ], [
            # Color for Condition 1
            'background-color: silver',
            # Color for Condition 2
            'background-color: yellow',
            # Color for Condition 3
            'background-color: red'
        ])
        return styles_df
    
    
    df.style.apply(apply_colors, axis=None)
    

    Resulting Styled Table


    Alternatively, since pandas styler objects are not designed for large repr, we usually have some leeway to do things less efficiently than in data operations with pandas.

    It may make sense, depending on the actual complexity of the conditions, to do multiple Styler.apply operations and perform the operation exactly as described.

    1. Values between 0 and 100 (non inclusive) are coloured with yellow,
    2. anything over 100 gets coloured with red,
    3. then after go back through where the corresponding Values are less than 10.
    def initial_color(df_slice: pd.DataFrame) -> np.ndarray:
        styles = np.select([
            # Condition 1
            (0 < df_slice) & (df_slice < 100),
            # Condition 2
            df_slice >= 100
        ], [
            # Color for Condition 1
            'background-color: yellow',
            # Color for Condition 2
            'background-color: red'
        ])
        return styles
    
    
    def recolor_gray(df_slice: pd.DataFrame) -> pd.DataFrame:
        # Create an empty DataFrame of styles
        styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
        # Make background silver where Value is less than 10
        styles_df['Procent'] = np.where(
            df_slice['Value'] < 10, 'background-color: silver', ''
        )
        # Return the styles
        return styles_df
    
    
    (
        df.style
            .apply(initial_color, subset='Procent', axis=None)
            .apply(recolor_gray, axis=None)
    )
    

    Note that this method, specifically this line:

    styles_df['Procent'] = np.where(
        df_slice['Value'] < 10, 'background-color: silver', ''
    )
    

    makes the same assumption as above, that Value and Procent are identically indexed (as shown in the display).

    However if this is not guaranteed to be the case, we can again reindex_like Procent so the shapes are guaranteed to align properly:

    def recolor_gray(df_slice: pd.DataFrame) -> pd.DataFrame:
        # Create an empty DataFrame of styles
        styles_df = pd.DataFrame('', index=df_slice.index, columns=df_slice.columns)
        # Make background silver where Value is less than 10
        styles_df['Procent'] = np.where(
            # Ensure indexes align correctly
            df_slice['Value'].reindex_like(df_slice['Procent']) < 10,
            'background-color: silver', ''
        )
        # Return the styles
        return styles_df
    

    This results in the same table; the same operations are simply performed in a different order:

    Resulting Styled Table


    Since pandas styler behaviour can very significantly between versions the above answer was written in pandas version 1.4.2 and numpy version 1.23.2.

    Imports and aliases are as follows:

    import numpy as np  # v1.23.2
    import pandas as pd  # v1.4.3