Tags: python, matplotlib, jupyter-notebook, seaborn, boxplot

How to add a label for the mean values in a sns.boxplot() when showmeans is set to True?


I have created a boxplot using seaborn and matplotlib, and I added the means by setting showmeans=True, like so: sns.boxplot(data=df, x="Job", y="Age", order=ylabels, showmeans=True). However, the means do not show up in the legend. I would like them to appear there so they don't just look like random points.

The first thing I tried was the obvious plt.legend(), which changed nothing. I then tried changing the meanprops values, but that had no effect either. I also started on a separate visualization using subplots, building another data set for the means and layering it on top. I did not put much time into that, because I know that with enough tinkering I can get it to work. That is not my question, though: I am trying to avoid doing that.


Solution

  • sns.boxplot() accepts most parameters of ax.boxplot(). One of those parameters is showmeans=. Another is meanprops=, a dictionary of properties that control how the mean marker is drawn. One of those properties is a label.

    Unfortunately, setting the label via meanprops assigns the label to every mean marker. With 4 box plots, the mean would show up 4 times in the legend. To avoid that, grab the lists of handles and labels via ax.get_legend_handles_labels() and pass only the first entry to ax.legend(). The code needs to be adapted if there are other elements in the legend.

    import matplotlib.pyplot as plt
    import seaborn as sns
    
    tips = sns.load_dataset('tips')
    sns.set_style('white')
    ax = sns.boxplot(data=tips, x='day', y='tip', color='salmon',
                     showmeans=True, meanprops={'label': 'mean'})
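    # every mean marker carries the label, so the handle list has one entry per box
    handles, labels = ax.get_legend_handles_labels()
    # keep only the first handle/label pair to get a single 'mean' legend entry
    ax.legend(handles[:1], labels[:1])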
    plt.show()
    

    Figure: sns.boxplot with mean in the legend
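If the legend may contain other entries, slicing off the first handle is fragile. A minimal sketch of one way to adapt the code is to keep one handle per unique label instead. The helper name boxplot_with_mean_legend and the small synthetic data frame are assumptions made for illustration; they are not part of seaborn or of the original answer.

```python
import pandas as pd
import seaborn as sns


def boxplot_with_mean_legend(df, x, y, mean_label="mean", **kwargs):
    """Hypothetical helper: box plot with means and one legend entry per label."""
    ax = sns.boxplot(data=df, x=x, y=y, showmeans=True,
                     meanprops={"label": mean_label}, **kwargs)
    handles, labels = ax.get_legend_handles_labels()
    unique = {}
    for handle, label in zip(handles, labels):
        # keep only the first handle seen for each label
        unique.setdefault(label, handle)
    ax.legend(list(unique.values()), list(unique.keys()))
    return ax


# Small synthetic frame (an assumption) so the sketch runs without a download.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "value": [1, 2, 6, 3, 4, 8, 5, 6, 10],
})
ax = boxplot_with_mean_legend(df, x="group", y="value", color="salmon")
```

Call plt.show() (after importing matplotlib.pyplot as plt) to display the figure. Any other labeled artists on the same axes keep their own single legend entry.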

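    Alternatively, you could add the means via sns.pointplot(), which plots the mean of each category by default, suppressing the connecting lines and error bars. Because the label is passed once to pointplot, no filtering of legend handles is needed. The following example draws horizontal box plots: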

    import matplotlib.pyplot as plt
    import seaborn as sns
    
    flights = sns.load_dataset('flights')
    sns.set_style('white')
    ax = sns.boxplot(data=flights, y='month', x='passengers', color='turquoise', orient='h')
    # pointplot shows the per-month mean; ls='' hides the connecting line
    sns.pointplot(data=flights, y='month', x='passengers', label='mean',
                  errorbar=None, markers='D', ls='', orient='h', ax=ax)
    plt.show()
    

    Figure: sns.boxplot with sns.pointplot for the means
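A third option, which is not from the answer above but is a common matplotlib technique, is to leave the mean markers unlabeled and give the legend a hand-made "proxy" handle that matches their style. The sketch below assumes a small synthetic data frame and an arbitrary marker style (mean_style), both chosen only for illustration.

```python
import pandas as pd
import seaborn as sns
from matplotlib.lines import Line2D

# Small synthetic frame (an assumption) so the sketch runs without a download.
df = pd.DataFrame({
    "group": ["a"] * 4 + ["b"] * 4,
    "value": [1, 2, 3, 10, 2, 4, 6, 8],
})

# Marker style shared by the plotted means and the legend entry.
mean_style = {"marker": "D", "markerfacecolor": "black",
              "markeredgecolor": "black", "markersize": 6}

ax = sns.boxplot(data=df, x="group", y="value", color="salmon",
                 showmeans=True, meanprops=mean_style)

# Proxy artist: an empty line that exists only to supply one legend entry.
mean_handle = Line2D([], [], linestyle="", label="mean", **mean_style)
ax.legend(handles=[mean_handle])
```

Since the legend is built from an explicit handle list, the number of boxes does not affect it. The trade-off is that the marker style has to be kept in sync by hand, which is why mean_style is defined once and reused.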