I have created a boxplot using seaborn and matplotlib and I have added the means by setting showmeans to true like so:
sns.boxplot(data=df, x="Job", y="Age", order=ylabels, showmeans=True)
However, the means do not show up in the legend. I would like them to show up in the legend so they don't just appear to be random points.
First thing I tried was the obvious:
plt.legend()
which yielded no change. I then tried changing the meanprops values, but that did not change anything either. I also started on a separate visualization that uses subplots to plot the means as another data set layered on top. I did not put much time into that, because I know I can get it to work with enough tinkering. That is not my question, though; I am trying to avoid that approach.
sns.boxplot() accepts most parameters of ax.boxplot(). One of those parameters is showmeans=. Another is meanprops=, which is a dictionary of properties to change the mean. One of those properties is a label.
Unfortunately, setting the label via the meanprops will assign the label to each of the means. If there are 4 box plots, the mean would show up 4 times in the legend. You can grab the list of handles and labels for the legend via ax.get_legend_handles_labels(), and only use part of them. The code needs to be adapted if there are other elements in the legend.
import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset('tips')
sns.set_style('white')
ax = sns.boxplot(data=tips, x='day', y='tip', color='salmon',
                 showmeans=True, meanprops={'label': 'mean'})
# each box's mean marker carries the label, so keep only the first handle
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles[:1], labels[:1])
plt.show()
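If the legend also holds other entries, slicing with [:1] would drop them. One possible adaptation, sketched here under assumptions, is to keep one handle per distinct label instead. The helper name legend_without_duplicates, the small made-up DataFrame (standing in for the question's df) and the extra "overall mean" line are all hypothetical and only serve the illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def legend_without_duplicates(ax, **legend_kws):
    """Create a legend that keeps a single handle per distinct label."""
    handles, labels = ax.get_legend_handles_labels()
    # dict keys are unique, so repeated labels collapse into one entry
    unique = dict(zip(labels, handles))
    return ax.legend(list(unique.values()), list(unique.keys()), **legend_kws)


# Made-up data, standing in for the question's df (hypothetical values)
df = pd.DataFrame({
    "Job": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "Age": [23, 35, 41, 30, 52, 47, 28, 33, 60],
})

fig, ax = plt.subplots()
sns.boxplot(data=df, x="Job", y="Age", showmeans=True,
            meanprops={"label": "mean"}, ax=ax)
# a second labeled element, to show that it survives the de-duplication
ax.axhline(df["Age"].mean(), color="gray", ls="--", label="overall mean")
legend = legend_without_duplicates(ax)
```

Any keyword arguments given to the helper are forwarded to ax.legend(), so options such as loc= or title= can still be set.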
Alternatively, you could add the means via sns.pointplot(), suppressing the lines and error bars. The following example draws horizontal box plots:
import matplotlib.pyplot as plt
import seaborn as sns
flights = sns.load_dataset('flights')
sns.set_style('white')
ax = sns.boxplot(data=flights, y='month', x='passengers', color='turquoise', orient='h')
# draw only the mean markers: no connecting line, no error bars
sns.pointplot(data=flights, y='month', x='passengers', label='mean',
              errorbar=None, markers='D', ls='', orient='h', ax=ax)
plt.show()