How to add a mean and median line to a Seaborn displot


Is there a way to add the mean and median to Seaborn's displot?

import seaborn as sns

penguins = sns.load_dataset("penguins")
g = sns.displot(
    data=penguins, x='body_mass_g',
    col='species',
    facet_kws=dict(sharey=False, sharex=False)
)


Based on "Add mean and variability to seaborn FacetGrid distplots", I see that I can define a FacetGrid and map a function onto it. Can I pass a custom function to displot instead?

I want to use displot directly because its plots look much better out of the box, without tweaking tick label size, axis label size, etc., and are visually consistent with the other plots I am making.

def specs(x, **kwargs):
    ax = sns.histplot(x=x)
    ax.axvline(x.mean(), color='k', lw=2)
    ax.axvline(x.median(), color='k', ls='--', lw=2)

g = sns.FacetGrid(data=penguins, col='species')
g.map(specs, 'body_mass_g')



Solution

  • Option 1

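    • Use plt.axvline instead of ax.axvline.
      • In the OP, the vertical lines are drawn on the ax returned by histplot. Here, displot has already created the figure, and .map makes each facet's axes current before calling the function, so plt.axvline draws on the correct facet.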
    import matplotlib.pyplot as plt
    import seaborn as sns

    penguins = sns.load_dataset("penguins")
    g = sns.displot(
        data=penguins, x='body_mass_g',
        col='species',
        facet_kws=dict(sharey=False, sharex=False)
    )

    # called once per facet; x holds that facet's body_mass_g values
    def specs(x, **kwargs):
        plt.axvline(x.mean(), c='k', ls='-', lw=2.5)
        plt.axvline(x.median(), c='orange', ls='--', lw=2.5)

    g.map(specs, 'body_mass_g')
    

  • Option 2

    • This option is more verbose, but more flexible in that it allows for accessing and adding information from a data source other than the one used to create the displot.
    import seaborn as sns
    import pandas as pd
    
    # load the data
    pen = sns.load_dataset("penguins")
    
    # groupby to get mean and median
    pen_g = pen.groupby('species').body_mass_g.agg(['mean', 'median'])
    
    g = sns.displot(
        data=pen, x='body_mass_g',
        col='species',  
        facet_kws=dict(sharey=False, sharex=False)
    )
    # extract and flatten the axes from the figure
    axes = g.axes.flatten()
    
    # iterate through each axes
    for ax in axes:
        # extract the species name
        spec = ax.get_title().split(' = ')[1]
        
        # select the data for the species
        data = pen_g.loc[spec, :]
        
        # print data as needed or comment out
        print(data)
        
        # plot the lines
        ax.axvline(x=data['mean'], c='k', ls='-', lw=2.5)
        ax.axvline(x=data['median'], c='orange', ls='--', lw=2.5)
    

    Output for both options

    [image: one histogram per species, each with a solid black line at the mean and a dashed orange line at the median]