How to make a pairplot have a diagonal histogram with a hue using seaborn?


I'm trying to make a scatter-plot pairplot with histograms on the diagonal, but when I add a hue the diagonal plots become invalid.

My code before hue:

import seaborn as sn
sn.pairplot(dropped_data)

Output: Output 1

My code after adding hue:

sn.pairplot(dropped_data, hue='damage rating')

Output: Output 2

What I have tried:

sn.pairplot(dropped_data, hue='damage rating', diag_kind='hist', kind='scatter')

Output: Output 3

As you can see, when using a hue, the diagonal histograms go all weird and become incorrect. How can I fix this?


Solution

  • It looks like the hue column is continuous and contains only unique values. When a hue is given, pairplot draws kdeplots on the diagonal by default, one per hue value, and a kde can't be estimated when it is built from only one value.

    One way to tackle this is to use stacked histplots. This might be slow when a lot of data is involved.

    Another approach is to make the hue column discrete, e.g. by rounding its values.

    A reproducible example

    First, let's try to recreate the problem with easily reproducible data:

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    import numpy as np
    
    np.random.seed(20220226)
    df = pd.DataFrame({f'Sensor {i}': np.random.randn(100) for i in range(1, 4)})
    df['damage'] = np.random.rand(100)
    
    sns.pairplot(df, hue="damage")
    

    (Figure: sns.pairplot using continuous hue)
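
    To confirm that the hue column is the culprit, it can help to compare its number of distinct values with the number of rows. A minimal check on the same synthetic data, where the 'damage' column stands in for the asker's 'damage rating':

```python
import numpy as np
import pandas as pd

# Same synthetic data as in the reproducible example
np.random.seed(20220226)
df = pd.DataFrame({f'Sensor {i}': np.random.randn(100) for i in range(1, 4)})
df['damage'] = np.random.rand(100)

# Number of distinct hue values versus number of rows
n_unique = df['damage'].nunique()
n_rows = len(df)
print(f'{n_unique} distinct hue values for {n_rows} rows')

# Size of the largest hue group: 1 means every kde would get a single point
largest_group = df['damage'].value_counts().max()
print(f'largest hue group has {largest_group} row(s)')
```

    If the two counts are (nearly) equal, each hue level holds a single observation and the per-hue diagonal plots have nothing to estimate from.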

    Working with a stacked histogram

    sns.pairplot(df, hue="damage", diag_kind='hist', diag_kws={'multiple': 'stack'})
    

    (Figure: sns.pairplot with stacked histplot)

    Making the hue column discrete via rounding

    df['damage'] = (df['damage'] * 5).round() / 5  # round to multiples of 0.2
    sns.pairplot(df, hue="damage")
    

    (Figure: sns.pairplot with rounded hue values)