Tags: python, python-3.x, seaborn, histogram, displot

seaborn histplot and displot output doesn't match


  • Histograms generated by seaborn.histplot and seaborn.displot do not match.
    • Default plot for sns.displot is kind='hist'
  • Tested with Python 3.8.11, seaborn 0.11.2, and matplotlib 3.4.2
  • Why do the outputs not match, and how can this be resolved?
  • Given the same bins, the densities of the corresponding plots are expected to match.
  • The seaborn tutorial "Visualizing distributions of data" doesn't resolve the question.
import seaborn as sns
import matplotlib.pyplot as plt

# sample data: wide
dfw = sns.load_dataset("penguins", cache=False)[['bill_length_mm', 'bill_depth_mm']].dropna()

# sample data: long
dfl = dfw.melt(var_name='bill_size', value_name='vals')

seaborn.displot

  1. Ignores 'sharex': False, though 'sharey' works
  2. Ignores bins
fg = sns.displot(data=dfl, x='vals', col='bill_size', kde=True, stat='density', bins=12, height=4, facet_kws={'sharey': False, 'sharex': False})
plt.show()

  3. Setting xlim doesn't make a difference
fg = sns.displot(data=dfl, x='vals', col='bill_size', kde=True, stat='density', bins=12, height=4, facet_kws={'sharey': False, 'sharex': False})
axes = fg.axes.ravel()
axes[0].set_xlim(25, 65)
axes[1].set_xlim(13, 26)
plt.show()

seaborn.histplot

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))

sns.histplot(data=dfw.bill_length_mm, kde=True, stat='density', bins=12, ax=ax1)
sns.histplot(data=dfw.bill_depth_mm, kde=True, stat='density', bins=12, ax=ax2)
fig.tight_layout()
plt.show()

Update

  • As suggested by mwaskom, common_bins=False gets the histograms into the same shape, resolving the issues of ignoring bins and sharex. However, the density seems to be affected by the number of plots in the displot.
    • If there are 3 plots in the displot, then the density is 1/3 that shown in the histplot; for 2 plots, the density is 1/2.
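
One way to read the 1/2 and 1/3 factors: if each facet's bar heights are divided by the total number of observations in the figure rather than by the facet's own count, a facet's bars integrate to its share of the data. The sketch below is dependency-free and does not reproduce seaborn's internals; the helper density_hist and the synthetic samples are assumptions made only for illustration.

```python
import random


def density_hist(values, bins, lo, hi, norm_count):
    """Bar heights for equal-width bins, normalised by norm_count observations."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Values on the right edge fall into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    heights = [c / (norm_count * width) for c in counts]
    return heights, width


random.seed(0)
# Two synthetic facets of equal size (hypothetical data, not the penguins dataset).
facets = {
    "a": [random.gauss(44, 5) for _ in range(300)],
    "b": [random.gauss(17, 2) for _ in range(300)],
}
total = sum(len(vals) for vals in facets.values())

areas = {}
for name, vals in facets.items():
    lo, hi = min(vals), max(vals)
    # Normalise by every observation in the figure (shared normalisation).
    shared, width = density_hist(vals, 12, lo, hi, norm_count=total)
    # Normalise by this facet's observations only (independent normalisation).
    own, _ = density_hist(vals, 12, lo, hi, norm_count=len(vals))
    areas[name] = (sum(shared) * width, sum(own) * width)
    print(name, round(areas[name][0], 3), round(areas[name][1], 3))
```

With equal-sized facets the shared normalisation gives each facet an area of 1/2 (1/3 for three facets), while independent normalisation gives an area of 1. For unequal facets the factor would follow each facet's share of the observations, not the number of facets.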


Solution

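    • As suggested by mwaskom in a comment, common_bins=False gets the histograms into the same shape, which resolves the apparent issues of bins and sharex being ignored. With the default common_bins=True, the bin edges are computed once over the pooled data, so each facet displays only part of the shared bins.
    • The density is not divided by the number of facets. With the default common_norm=True, each facet is normalized by the total number of data points across all facets, so a facet's density integrates to its share of the data. Setting common_norm=False normalizes each facet independently, which matches the histplot output.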
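
To illustrate why bins=12 can look ignored when the bin edges are shared, the sketch below builds 12 equal-width bins over a pooled range and counts how many of them overlap each facet's own range. The ranges are rough, hypothetical stand-ins for the two bill measurements, and bin_edges and bins_overlapping are illustrative helpers, not seaborn functions.

```python
def bin_edges(lo, hi, bins):
    """Edges of `bins` equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]


def bins_overlapping(edges, lo, hi):
    """Number of bins that intersect the open interval (lo, hi)."""
    return sum(1 for left, right in zip(edges, edges[1:]) if right > lo and left < hi)


# Hypothetical value ranges in mm, chosen only to resemble the example data.
ranges = {"bill_length_mm": (32.0, 60.0), "bill_depth_mm": (13.0, 21.5)}

# Shared edges: one set of 12 bins over the pooled range of both variables.
pooled_lo = min(lo for lo, _ in ranges.values())
pooled_hi = max(hi for _, hi in ranges.values())
shared_edges = bin_edges(pooled_lo, pooled_hi, 12)

overlap = {}
for name, (lo, hi) in ranges.items():
    n_shared = bins_overlapping(shared_edges, lo, hi)
    # Independent edges: 12 bins over this variable's own range.
    n_own = bins_overlapping(bin_edges(lo, hi, 12), lo, hi)
    overlap[name] = (n_shared, n_own)
    print(f"{name}: {n_shared} shared bins cover the data, {n_own} independent bins")
```

Under these assumed ranges, the narrower variable is covered by only a few of the shared bins, which is consistent with the coarse bars in the faceted plot before common_bins=False is set.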

    Plot Code

    # displot
    fg = sns.displot(data=dfl, x='vals', col='bill_size', kde=True, stat='density', bins=12, height=4,
                     facet_kws={'sharey': False, 'sharex': False}, common_bins=False, common_norm=False)
    
    fg.fig.subplots_adjust(top=0.85)
    fg.fig.suptitle('Displot with common_bins & common_norm as False')
    plt.show()
    
    # histplot
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
    
    sns.histplot(data=dfw.bill_length_mm, kde=True, stat='density', bins=12, ax=ax1)
    sns.histplot(data=dfw.bill_depth_mm, kde=True, stat='density', bins=12, ax=ax2)
    
    fig.suptitle('Histplot')
    
    # tight_layout recomputes the subplot margins, so set the top margin afterwards
    fig.tight_layout()
    fig.subplots_adjust(top=0.85)
    plt.show()