python matplotlib plot seaborn

Adding the difference value of stripplot groups in seaborn


I used this seaborn stripplot example:

    ax = sns.stripplot(x="day", y="total_bill", hue="smoker",
                       data=tips, palette="Set2", dodge=True)

My data is shown as follows:

[image: strip plot of my data]

I have calculated the minimum and maximum across all three groups (12, 13, and 14) for each color separately, because I want to show how the colors differ across the three groups. I want to show the difference between the minimum and maximum on the strip plot, drawn in the corresponding color. Is there a way to do this? Alternatively, could a vertical line beside the strip plot show the difference between the minimum and maximum value of each color across all groups?

In the picture, I marked the minimum and maximum manually with arrows.

For example, the min and max values are as follows:

    {'min_320': 29.977999999084474, 'max_320': 35.66800000610352}
    {'min_384': 33.4459999987793, 'max_384': 36.849999999389645}
    {'min_448': 34.49400000915527, 'max_448': 37.29800001159668}

Solution

  • The approach below creates bar plots to show the range for each hue value. Bars are first created using the maximum of each hue category. Then, the bottoms are changed to the minimum values.

    import matplotlib.pyplot as plt
    import seaborn as sns
    import numpy as np
    
    tips = sns.load_dataset('tips')
    # set pd.Categorical type to fix an order
    tips["day"] = tips["day"].astype('category')
    tips["smoker"] = tips["smoker"].astype('category')
    
    ax = sns.stripplot(x="day", y="total_bill", hue="smoker", data=tips, palette="Set2", dodge=True)
    
    day_cats = tips["day"].cat.categories
    smoker_cats = tips["smoker"].cat.categories
    
    # find maximum and minimum of each hue category
    max_values = tips.groupby("smoker", observed=True).agg({'total_bill': 'max'}).values.flatten()
    min_values = tips.groupby("smoker", observed=True).agg({'total_bill': 'min'}).values.flatten()
    
    # create bar plots with the maximum values as height, repeat for each x value
    for day in day_cats:
        sns.barplot(x=[day] * len(smoker_cats), y=max_values, hue=smoker_cats, palette="Set2",
                    fill=False, legend=False, ax=ax)
    # loop through the bars and set the bottom of the bars to the minimum of its hue category
    for bars, bottom in zip(ax.containers, np.tile(min_values, len(day_cats))):
        for bar in bars:
            bar.set_y(bottom)
            bar.set_height(bar.get_height() - bottom)
    plt.show()

[image: seaborn stripplot with boxes to show the range]
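
The boxes above show each range but not the numeric difference. If you would rather have the vertical-line variant from the question, a minimal sketch could look like the following. It uses plain matplotlib and the min/max values posted in the question. The helper name `annotate_ranges`, the `x_start`/`x_step` positions, and the use of matplotlib's `Set2` colormap to mimic `palette="Set2"` are my own assumptions, not part of seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

# (min, max) per hue level, copied from the question
ranges = {
    "320": (29.977999999084474, 35.66800000610352),
    "384": (33.4459999987793, 36.849999999389645),
    "448": (34.49400000915527, 37.29800001159668),
}
# assumption: the first Set2 colors correspond to the hue levels, in order
colors = dict(zip(ranges, plt.get_cmap("Set2").colors))


def annotate_ranges(ax, ranges, colors, x_start, x_step=0.15):
    """Hypothetical helper: draw one vertical line per hue level from its
    minimum to its maximum and label it with the difference (max - min)."""
    diffs = {}
    x = x_start
    for i, (name, (lo, hi)) in enumerate(ranges.items()):
        x = x_start + i * x_step
        diffs[name] = hi - lo
        ax.vlines(x, lo, hi, color=colors[name], linewidth=2)
        ax.text(x, hi, f"{diffs[name]:.2f}", color=colors[name],
                ha="center", va="bottom")
    # widen the x axis so the lines are not clipped on a categorical axis
    ax.set_xlim(right=max(ax.get_xlim()[1], x + x_step))
    return diffs


fig, ax = plt.subplots()  # in practice, pass the Axes returned by sns.stripplot
diffs = annotate_ranges(ax, ranges, colors, x_start=2.6)
fig.savefig("ranges.png")
```

With three x categories the strips sit at positions 0, 1 and 2, so `x_start=2.6` places the lines just to the right of the last group. Adjust it for a different number of categories.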