Tags: python, matplotlib, seaborn, histogram, pairplot

How to get a stacked histogram in PairGrid or pairplot


I wish to reproduce the PairGrid plot found in that tutorial, but locally my bar charts are not stacked as they are in the tutorial, and I can't figure out how to make them so.

import seaborn as sns
import matplotlib.pyplot as plt  # for graphics
import sys
sys.version
# '3.6.4 (default, Sep 20 2018, 19:07:50) \n[GCC 5.4.0 20160609]'

sns.__version__    
# '0.9.0'

mpg = sns.load_dataset('mpg')

g = sns.PairGrid(data=mpg[["mpg", "horsepower", "weight", "origin"]], hue="origin")
g.map_upper(sns.regplot)
g.map_lower(sns.residplot)

# below for the histogram
g.map_diag(plt.hist)

# also I tried
# g.map_diag(lambda x, label, color: plt.hist(x, label=label, color=color, histtype='barstacked', alpha=.4))
# g.map_diag(plt.hist, histtype='barstacked')
# but same result

g.savefig('./Plots/mpg.svg')

My plots

The tutorial's plot

Do I have to follow the second answer to this post, which suggests that this is very tricky to do with seaborn, or should I turn back to plt, as suggested here, for a simpler chart?

In any case I'm curious to understand how they stacked their bars in the tutorial linked above.


Solution

  • The option for stacked histograms on the diagonal of a PairGrid was removed from seaborn in this commit, so it is no longer available in seaborn 0.9.

    A workaround is to collect the data of every hue level for each diagonal axes first, and then plot it to the respective axes in a single call.

    import matplotlib.pyplot as plt
    import seaborn as sns

    df = sns.load_dataset('mpg')

    g = sns.PairGrid(data=df[["mpg", "horsepower", "weight", "origin"]], hue="origin")
    g.map_upper(sns.regplot)
    g.map_lower(sns.residplot)

    # below for the histograms on the diagonal
    # map_diag calls func once per hue level and diagonal axes, so collect
    # each group's data and color per axes instead of plotting right away
    d = {}
    def func(x, **kwargs):
        ax = plt.gca()

        if ax not in d:
            d[ax] = {"data" : [], "color" : []}
        d[ax]["data"].append(x)
        d[ax]["color"].append(kwargs.get("color"))

    g.map_diag(func)

    # draw all groups of an axes in a single hist call so the bars stack
    for ax, dic in d.items():
        ax.hist(dic["data"], color=dic["color"], histtype="barstacked")
    
    plt.show()
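
Why the workaround collects the data before plotting can be seen in a small matplotlib-only sketch. It is an illustration under assumptions, not part of the original answer: the data is synthetic, and the group names `a`, `b`, `c` are hypothetical stand-ins for the `origin` levels. Calling `hist` once per group starts every group's bars at zero, whereas a single call with a list of arrays bins all groups on shared edges and stacks them.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for one numeric column split by a hue variable
# (hypothetical group names and values, not the mpg dataset).
rng = np.random.default_rng(0)
groups = {
    "a": rng.normal(20, 5, size=100),
    "b": rng.normal(25, 5, size=60),
    "c": rng.normal(30, 5, size=40),
}

fig, (ax_overlaid, ax_stacked) = plt.subplots(1, 2, sharey=True)

# One hist call per group: each call starts its bars at zero,
# so the groups are drawn on top of each other.
for values in groups.values():
    ax_overlaid.hist(values, histtype="barstacked")
ax_overlaid.set_title("one call per group")

# A single hist call with a list of arrays: matplotlib bins all groups
# on shared edges and stacks the bars.
counts, edges, _ = ax_stacked.hist(list(groups.values()), histtype="barstacked")
ax_stacked.set_title("single call, stacked")

# For stacked histograms the returned counts are cumulative over the groups,
# so the last row holds the total bar heights.
total_points = sum(len(values) for values in groups.values())
```

The left panel corresponds to what `g.map_diag(plt.hist, histtype='barstacked')` produces, since `map_diag` invokes the function separately for each hue level. If upgrading is an option, seaborn 0.11 and later provide `sns.histplot` with a `multiple="stack"` parameter, so `g.map_diag(sns.histplot, multiple="stack")` may be a shorter route there.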
    
