Adding error bars to seaborn scatter plot (when a line plot is combined)


I have a scatter + lineplot in seaborn, created in this way:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# load sample data from seaborn
flights = sns.load_dataset('flights')

fig_example = plt.figure(figsize=(10, 10))
sns.lineplot(data=flights, x="year", y="passengers", hue="month")
sns.scatterplot(data=flights, x="year", y="passengers", hue="month", legend=False)

Now I want to add error bars. For example, the first data point is (year=1949, passengers=112), and I want to add a standard deviation to that specific point, e.g. ± 5 passengers. How can I do this?

This question does not answer my question: How to use custom error bar in seaborn lineplot

I need to add it to the scatter plot, not to the line plot.

When I try this command:

ax = sns.scatterplot(x="x", y="y", hue="h", data=gqa_tips, s=100, ci='sd', err_style='bars')

It fails:

AttributeError: 'PathCollection' object has no property 'err_style'

Solution

    • This question seems to reflect a misunderstanding of error bars / confidence intervals (ci).
      • Specifically: "...the first entry point...I want to add for this specific item, an std"
    • Putting an error bar on an individual data point is an incorrect statistical representation, because an individual data point does not have an error, at least not as it relates to the question.
    • Each point in the plot is an exact observed value, so it has no spread of its own.
      • Only aggregated values (e.g. the mean) have a ci, which is computed from all of the underlying data points.
    • A lineplot without hue aggregates the repeated observations at each x value using estimator='mean', and that mean then has a ci.
    • Refer back to How to use custom error bar in seaborn lineplot for customizing the ci.
    • As of seaborn 0.12, the ci parameter is deprecated; the errorbar parameter should be used instead.
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    
    # load the data
    flights = sns.load_dataset('flights')
    
    # plots (the errorbar parameter requires seaborn >= 0.12; older versions use ci)
    fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=(18, 7))
    
    # mean per year with a 95% confidence interval band
    sns.lineplot(data=flights, x="year", y="passengers", marker='o', errorbar=('ci', 95), ax=ax1, label='Mean CI: 95')
    ax1.set(title='Mean Passengers per Year')
    
    # mean per year with standard deviation bars, plus the yearly min and max
    sns.lineplot(data=flights, x="year", y="passengers", errorbar='sd', err_style='bars', ax=ax2, label='Mean CI: sd')
    flights.groupby('year').passengers.agg(['min', 'max']).plot(ax=ax2)
    ax2.set(title='Mean, Min & Max Passengers per Year')
    
    # one line per month: every point is a single observation, so there is no ci
    sns.lineplot(data=flights, x="year", y="passengers", hue="month", marker='o', ax=ax3)
    ax3.set(title='Individual Passengers per Month\nNo CI for Individual Points')
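
To make the point about aggregated versus individual values concrete, here is a minimal sketch using only the Python standard library. The `samples` dictionary and the `summarize` helper are hypothetical names introduced for illustration, and the numbers are a small hand-typed sample rather than the full flights dataset. It shows that a group of observations has a mean and a standard deviation, while a single observation has no spread to compute.

```python
from statistics import StatisticsError, mean, stdev

# Hypothetical sample: a few passenger counts per year, typed in by hand
# (an assumption for illustration, not the full flights dataset).
samples = {
    1949: [112, 118, 132],
    1950: [115, 126, 141],
}


def summarize(values):
    """Return (mean, sample standard deviation) for a group of observations."""
    return mean(values), stdev(values)


# Aggregated values: each year has several observations, so a spread exists.
summary = {year: summarize(values) for year, values in samples.items()}
for year, (avg, sd) in summary.items():
    print(f"{year}: mean={avg:.1f}, sd={sd:.1f}")

# A single observation: the sample standard deviation is undefined,
# so statistics.stdev raises StatisticsError.
try:
    stdev([112])
    single_point_has_sd = True
except StatisticsError:
    single_point_has_sd = False

print("single point has an sd:", single_point_has_sd)
```

This is roughly the kind of per-group aggregation that a lineplot without hue performs before drawing error bars. With hue="month", each (year, month) pair contains exactly one observation, which is why no error bar appears.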
    
