python plot seaborn histogram

Plot multiple distributions in Seaborn histogram


I have a dataframe df_sz with four columns, and I would like to plot the histogram of each column "side by side" in Seaborn, i.e., without the histograms overlapping. However, when I run the following line, the histograms overlap:

sns.histplot(data=df_sz, bins=50, alpha=0.5, shrink=0.8, log_scale=True, multiple='layer')

I have tried all the options for the multiple argument, but none of them works.

Any solution? I really need to use Seaborn for this plot. I have attached a screenshot of the dataframe and the resulting histogram.


Solution

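  • The multiple= parameter seems to work together with the hue= parameter. For this to work, the dataframe should be converted to long form, e.g. with df_sz.melt(), which puts the original column names in a 'variable' column and the numbers in a 'value' column: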

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    import numpy as np
    
    df_sz = pd.DataFrame({'Ann1': (2 ** np.random.uniform(1, 20, 200)).astype(int),
                          'Ann2': (2 ** np.random.uniform(1, 20, 200)).astype(int),
                          'Ann3': (2 ** np.random.uniform(1, 20, 200)).astype(int),
                          'Ann4': (2 ** np.random.uniform(1, 20, 200)).astype(int)})
    
    fig, ax = plt.subplots(figsize=(20, 4)) # very wide figure to accommodate 200 bars (50 bins x 4 columns)
    sns.histplot(data=df_sz.melt(), x='value', hue='variable', bins=50,
                 alpha=0.5, shrink=0.8, log_scale=True, multiple='dodge', ax=ax)
    ax.legend_.set_title('') # remove the legend title (the name of the hue column)
    ax.margins(x=0.01) # less spacing
    plt.tight_layout()
    plt.show()
    

    [Figure: sns.histplot with multiple='dodge' using the long-form dataframe]
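
    To make the long-form conversion more concrete, here is a minimal sketch of what melt() does to a wide dataframe. The small df_wide dataframe and its values are made up for illustration and are not the data from the question:

```python
import pandas as pd

# Hypothetical wide-form dataframe (values invented for illustration only)
df_wide = pd.DataFrame({'Ann1': [2, 16, 128],
                        'Ann2': [4, 32, 256]})

# melt() without arguments stacks every column into two new columns:
# 'variable' holds the original column name, 'value' holds the cell content
df_long = df_wide.melt()
print(df_long)
```

    Each row of df_long is then one observation labelled with the column it came from, which is the shape that x='value', hue='variable' expects in the histplot call above.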