
How to create a seaborn violinplot with mean, median and mode displayed?


Is there a way to add a mean and a mode to a violinplot? I have categorical data in one column and the corresponding values in the next column. I looked into the matplotlib violin plot, which technically offers the functionality I am looking for, but it does not allow me to specify a categorical variable on the x axis. This is crucial, as I am looking at the distribution of the data per category. I have added a small table illustrating the shape of the data.

plt.figure(figsize=(10, 15))
ax = sns.violinplot(x='category', y='value', data=df)

[Image: table illustrating the shape of the data]


Solution

  • First we calculate the mode and mean of each category:

    import seaborn as sns
    import pandas as pd
    from matplotlib import pyplot as plt
    
    df = pd.DataFrame({'Category':[1,2,5,1,2,4,3,4,2],
                       'Value':[1.5,1.2,2.2,2.6,2.3,2.7,5,3,0]})
    
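    # per-category mean; the result is indexed by the sorted category values
    Means = df.groupby('Category')['Value'].mean()
    # Series.mode() can return several tied values (sorted); keep the first
    Modes = df.groupby('Category')['Value'].agg(lambda x: x.mode().iloc[0])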
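
The question title also asks for the median, which the snippet above does not compute. As a minimal sketch, all three statistics can be gathered in one pass with pandas named aggregation. The helper name `first_mode` and the `stats` table are illustrative and not part of the original answer. Keep in mind that `Series.mode` returns every tied value in ascending order, so for continuous data where all values in a group are unique, taking the first element simply yields the smallest value of that group.

```python
import pandas as pd

df = pd.DataFrame({'Category': [1, 2, 5, 1, 2, 4, 3, 4, 2],
                   'Value': [1.5, 1.2, 2.2, 2.6, 2.3, 2.7, 5, 3, 0]})


def first_mode(values):
    # hypothetical helper: Series.mode() returns all tied modes sorted
    # ascending, so .iloc[0] keeps the smallest of them
    return values.mode().iloc[0]


# one row per category, one column per statistic
stats = df.groupby('Category')['Value'].agg(
    mean='mean',
    median='median',
    mode=first_mode,
)
print(stats)
```

Each column of `stats` can then be passed as the `y` values of a scatter overlay, in the same way as `Means` and `Modes`.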
    

    You can use seaborn to make the basic plot. Below, I remove the inner boxplot by passing inner=None, so that the mode and mean markers are visible:

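    fig, ax = plt.subplots()
    # inner=None drops the inner boxplot so the markers stay visible
    sns.violinplot(x='Category', y='Value', data=df, inner=None, ax=ax)
    plt.setp(ax.collections, alpha=.3)  # fade the violins
    # numeric categories are drawn at x = 0, 1, 2, ... in sorted order,
    # which matches the order of the groupby index
    ax.scatter(x=range(len(Means)), y=Means, c="k", label="mean")
    ax.scatter(x=range(len(Modes)), y=Modes, label="mode")
    ax.legend()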
    

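
The overlay above relies on the groupby index and the violins both being in sorted order, which holds for numeric categories. The following is a hedged sketch of a variant that fixes the order explicitly through the `order=` argument of `sns.violinplot` and also marks the median. The `order` list, the `summary` table, the marker shapes, and the legend labels are assumptions made for illustration.

```python
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

df = pd.DataFrame({'Category': [1, 2, 5, 1, 2, 4, 3, 4, 2],
                   'Value': [1.5, 1.2, 2.2, 2.6, 2.3, 2.7, 5, 3, 0]})

# fix the category order once and reuse it for the violins and the markers
order = sorted(df['Category'].unique().tolist())

grouped = df.groupby('Category')['Value']
summary = pd.DataFrame({
    'mean': grouped.mean(),
    'median': grouped.median(),
    'mode': grouped.agg(lambda s: s.mode().iloc[0]),  # first of any tied modes
}).reindex(order)

fig, ax = plt.subplots()
sns.violinplot(x='Category', y='Value', data=df,
               order=order, inner=None, ax=ax)
plt.setp(ax.collections, alpha=.3)  # fade the violins before adding markers

# violin i is drawn at x = i, so row i of `summary` lines up with it
positions = range(len(order))
for name, marker in [('mean', 'o'), ('median', 's'), ('mode', '^')]:
    ax.scatter(positions, summary[name], marker=marker, label=name, zorder=3)
ax.legend()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to render the figure. Passing `order=` explicitly matters most for string categories, which seaborn would otherwise draw in order of appearance while `groupby` sorts them.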