python pandas dataframe multi-index styler

Pandas Styler - Custom Formatter for Names of MultiIndex


I've been trying to format a pandas DataFrame using the Styler. The behavior of format_index() seems a bit unpredictable when using a MultiIndex, and I can't figure out a way to format the names of the MultiIndex levels.

The following is a MWE with an extremely stupid formatter:

import pandas as pd
import numpy as np

data = {
    'Index1': ['A', 'B', 'C'],
    'Index2': ['X', 'Y', 'Z'],
    'Index3': ['1', '2', '3'],
    'Value1': np.random.randint(1, 100, 3),
    'Value2': np.random.randint(1, 100, 3),
    'Value3': np.random.randint(1, 100, 3)
}
df = pd.DataFrame(data)
df.set_index(['Index1', 'Index2', 'Index3'], inplace=True)

def custom_formatter(value):
    return 'S' + str(value)

styled_df = df.rename_axis(index=custom_formatter, columns=custom_formatter).style
styled_df = (
    styled_df.format(custom_formatter)
    .format_index(custom_formatter, axis=1)
    .format_index(custom_formatter, axis=0)
)
latex_table = styled_df.to_latex()
print(latex_table)

This results in

\begin{tabular}{lllrrr}
\toprule
 &  & SNone & SValue1 & SValue2 & SValue3 \\
SIndex1 & SIndex2 & SIndex3 &  &  &  \\
\midrule
SA & SX & S1 & S62 & S81 & S52 \\
SB & SY & S2 & S15 & S24 & S25 \\
SC & SZ & S3 & S22 & S36 & S48 \\
\bottomrule
\end{tabular}

This is nearly what I want (i.e. the formatter is applied to all table elements). However, the SNone should actually be blank, since there is no value defined there. If I remove the .rename_axis(...) portion, that cell is blank.

Does anyone have any other ideas how to format the MultiIndex properly or how to get rid of this bug? My main goal in having the indices is for common values to be grouped in multirows in LaTeX. If there are ways to achieve that without indices, I'd also be open to suggestions.


Solution

  • That's because df.columns.name is None, and casting None to str gives the literal string "None".

    You can fix it this way:

    def custom_formatter(value):
        # Only a missing name (None) becomes blank; falsy values such as 0 are still formatted
        return "" if value is None else f"S{value}"
    

    Output:

    >>> print(styled_df.to_latex())
    
    \begin{table}
    \.index_name.level0:nth-of-type(3)lightgreen
    \begin{tabular}{lllrrr}
     &  &  & SValue1 & SValue2 & SValue3 \\
    SIndex1 & SIndex2 & SIndex3 &  &  &  \\
    SA & SX & S1 & S45 & S68 & S84 \\
    SB & SY & S2 & S48 & S68 & S22 \\
    SC & SZ & S3 & S65 & S10 & S37 \\
    \end{tabular}
    \end{table}
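To see where the SNone comes from, here is a minimal sketch that reproduces the cause on a tiny frame. The frame contents and the name `none_safe_formatter` are made up for illustration. It assumes that `rename_axis` calls the function once per axis name, which is what the output in the question suggests.

```python
import pandas as pd


def naive_formatter(value):
    # Same idea as the formatter in the question: str(None) is "None"
    return "S" + str(value)


def none_safe_formatter(value):
    # Hypothetical variant: only None maps to an empty string
    return "" if value is None else f"S{value}"


df = pd.DataFrame({"Index1": ["A", "B"], "Value1": [0, 5]}).set_index("Index1")

# The columns axis has no name unless one is set explicitly
print(df.columns.name)

# rename_axis applies the function to every axis name, including the missing one
renamed = df.rename_axis(index=naive_formatter, columns=naive_formatter)
print(repr(renamed.index.name), repr(renamed.columns.name))

# The None-aware variant leaves the missing name blank but still formats 0
print(repr(none_safe_formatter(None)), repr(none_safe_formatter(0)))
```

Checking `is None` rather than truthiness matters only if the data can contain falsy values such as 0, which a plain `if value` test would also blank out.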
    
