python pandas seaborn heatmap

Why are values getting converted from int to float in my Seaborn heatmap?


I created a "Total" column and a "Total" row. They displayed the totals as integers until I tried to make Seaborn leave out every cell whose value is 0 (across the whole table, not only in that column and row). After that, the total column and row changed to float (with a trailing .0), and I don't know how to fix it.

df = pd.DataFrame(
        values,
        index=index,
        columns=columns
    )

df['Total'] = df.sum(axis=1, numeric_only=True)
df.loc[f'Total'] = df.sum(numeric_only=True)

df[df == 0] = np.nan

# This mask is, as far as I know, unrelated to what I'm discussing here
mask = np.zeros_like(df, dtype=bool)
mask[:, -1] = True
mask[0, :] = True

sns.heatmap(
        data=df,
        vmin=0,
        vmax=values.max(),
        annot=True,
        fmt='g',
        cmap=colormap,
        linewidths=0.5,
        linecolor='black',
        mask=mask,
        annot_kws={
            'size': 10,
        }
    )

[Image: resulting table]

The empty white cells (with no values) inside the table are exactly what I wanted to achieve. The total column and row are supposed to look like that, and that's what the mask is for, but I didn't include the code for that for simplicity.

To try to fix the float values, I tried typecasting that column and row, but nothing happened:

df['Total'] = df['Total'].astype(np.int64)
df.loc[f'Total'] = df.loc[f'Weekly Total'].astype(np.int64)

Solution

  • The problem in your code is hard to pinpoint, as you omitted the important part: how did you annotate the total row and column?

    There is also a logical mistake: a column containing NaN can't have a NumPy integer dtype, because NaN is a floating-point value. As such, you can't convert the total row to integer: the row spans columns that have become float, so df.loc[f'Total'] = df.loc[f'Weekly Total'].astype(np.int64) will still put floats in all elements.
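
    As a minimal sketch of that upcasting, the snippet below uses a small made-up frame (the column names and values are illustrative assumptions, not the asker's data). It uses df.mask(df == 0), which should have the same effect as df[df == 0] = np.nan but returns a new frame.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the asker's table
df = pd.DataFrame({"a": [0, 2, 3], "b": [1, 0, 4]})
dtypes_before = df.dtypes.astype(str).tolist()

# Replace zeros with NaN; every column that receives a NaN becomes float
df_nan = df.mask(df == 0)
dtypes_after = df_nan.dtypes.astype(str).tolist()

# A column that holds NaN cannot be cast to a NumPy integer dtype
try:
    df_nan["a"].astype(np.int64)
    cast_failed = False
except ValueError:
    cast_failed = True

# Row 2 has no NaN, so the cast itself works, but writing the row back
# into float columns stores floats again
df_nan.loc[2] = df_nan.loc[2].astype(np.int64)
row_dtype_after = str(df_nan.loc[2].dtype)

print(dtypes_before, dtypes_after, cast_failed, row_dtype_after)
```

    The last step mirrors the attempted fix from the question: the cast succeeds on the row itself, yet the values end up as floats once they are stored in the float columns.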

    You can create the desired heatmap in two steps:

    • Use the inverted mask to plot the total column and row, using a colormap that only uses a single color (e.g. cmap=ListedColormap(['white']))
    • Use the original mask to plot the rest of the heatmap

    To force values to be shown as integers, you might try fmt='.0f' (float with 0 decimals), but that is not strictly needed here.
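
    To see how the two format codes differ, here is a small sketch using Python's built-in format(), on the assumption that seaborn applies fmt as an ordinary Python format specification. The sample values are made up for illustration.

```python
# Hypothetical cell values, stored as floats as they are after the NaN step
cell_values = [3.0, 12.0, 12.7, 1234567.0]

# 'g' is the general format, '.0f' is fixed-point with zero decimals
labels_g = [format(v, "g") for v in cell_values]
labels_0f = [format(v, ".0f") for v in cell_values]

print(labels_g)
print(labels_0f)
```

    With 'g', whole-number floats are printed without a trailing .0, but values with more than six significant digits switch to scientific notation. With '.0f', every value is rounded to a whole number, so it only suits data that is known to be integral.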

    import matplotlib.pyplot as plt
    from matplotlib.colors import ListedColormap
    import numpy as np
    import seaborn as sns
    import pandas as pd
    
    # create some test data
    values = np.random.binomial(n=20, p=0.1, size=(10, 8))
    df = pd.DataFrame(
        values,
        index=range(10),
        columns=[*'abcdefgh']
    )
    
    df['Total'] = df.sum(axis=1, numeric_only=True)
    df.loc['Total'] = df.sum(numeric_only=True)
    
    df[df == 0] = np.nan  # this converts all columns to float, as np.nan can't be an integer
    
    mask = np.zeros_like(df, dtype=bool)
    mask[:, -1] = True
    mask[df.index.get_loc('Total'), :] = True
    
    ax = sns.heatmap(
        data=df,
        annot=True,
        fmt='g',
        cmap=ListedColormap(['lightblue']),  # a colormap with only one color
        linewidths=0.5,
        linecolor='black',
        mask=~mask,  # use the inverted mask
        annot_kws={'size': 10},
        cbar=False  # suppress the colorbar
    )
    
    sns.heatmap(
        data=df,
        vmin=0,
        vmax=values.max(),
        annot=True,
        fmt='g',
        cmap='RdYlGn',
        linewidths=0.5,
        linecolor='black',
        mask=mask,
        annot_kws={'size': 10},
        clip_on=False,  # show the cell borders also at the border of the plot
        ax=ax
    )
    
    # rotate the tick labels horizontally, don't show tick marks
    ax.tick_params(rotation=0, length=0)
    # optionally reverse the direction of the y-axis
    # ax.invert_yaxis()
    
    # optionally cross hatch the empty cells
    ax.patch.set_edgecolor('lightgrey')
    ax.patch.set_hatch('xxxx')
    
    plt.show()
    

    [Image: sns.heatmap with total row and column]