python pandas matplotlib data-science grouped-bar-chart

How to add error bars to a grouped bar plot


I would like to add error bars to my plot to show the min and max of each bar. Can anyone help? Thanks in advance.

The min and max values are as follows:

Delay = (53.46 (min 0, max 60), 36.22 (min 12, max 70), 83 (min 21, max 54), 17 (min 12, max 70))
Latency = (38 (min 2, max 70), 44 (min 12, max 87), 53 (min 9, max 60), 10 (min 11, max 77))

import matplotlib.pyplot as plt
import pandas as pd

Delay = (53.46, 36.22, 83, 17)
Latency = (38, 44, 53, 10)
index = ['T=0', 'T=26', 'T=50','T=900']
df = pd.DataFrame({'Delay': Delay, 'Latency': Latency}, index=index)
ax = df.plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)
plt.savefig('TestX.png', dpi=300, bbox_inches='tight')
plt.show()



Solution

    • In order to plot in the correct location on a bar plot, the patch data for each bar must be extracted.
    • df.plot.bar returns a single matplotlib.axes.Axes here; an ndarray with one Axes per column is returned only when subplots=True.
      • In the case of this figure, ax.patches contains 8 matplotlib.patches.Rectangle objects, one for each bar.
        • By using the associated methods for this object, the height, width, and x locations can be extracted, and used to draw a line with plt.vlines.
    • The height of each bar is used as the key to look up the correct min and max values in the dict z.
      • Unfortunately, the patch data does not contain the bar label (e.g. Delay or Latency), so the label cannot be used as the key.
    • Tested in python 3.11, pandas 1.5.3, matplotlib 3.7.0
    import pandas as pd
    import matplotlib.pyplot as plt
    
    # create dataframe
    Delay = (53.46, 36.22, 83, 17)
    Latency = (38, 44, 53, 10)
    index = ['T=0', 'T=26', 'T=50','T=900']
    df = pd.DataFrame({'Delay': Delay, 'Latency': Latency}, index=index)
    
    # dicts with errors
    Delay_error = {53.46: {'min': 0,'max': 60}, 36.22: {'min': 12,'max': 70}, 83: {'min': 21,'max': 54}, 17: {'min': 12,'max': 70}}
    Latency_error = {38: {'min': 2, 'max': 70}, 44: {'min': 12,'max': 87}, 53: {'min': 9,'max': 60}, 10: {'min': 11,'max': 77}}
    
    # combine them; this only works if all the keys are unique
    z = {**Delay_error, **Latency_error}
    
    # plot
    ax = df.plot.bar(rot=0)
    plt.xlabel('Time')
    plt.ylabel('(%)')
    plt.ylim(0, 101)
    
    for p in ax.patches:
        x = p.get_x()  # get the x coordinate of the bar's left edge
        w = p.get_width()  # get width of bar
        h = p.get_height()  # get height of bar
        min_y = z[h]['min']  # use h to get min from dict z
        max_y = z[h]['max']  # use h to get max from dict z
        plt.vlines(x+w/2, min_y, max_y, color='k')  # draw a vertical line
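
As a quick check on the ordering that the loop over ax.patches relies on, here is a minimal sketch that lists the bar heights in patch order. It reuses the data from the question; the variable name heights is only illustrative and is not part of the original answer.

```python
import matplotlib.pyplot as plt
import pandas as pd

index = ['T=0', 'T=26', 'T=50', 'T=900']
df = pd.DataFrame({'Delay': (53.46, 36.22, 83, 17),
                   'Latency': (38, 44, 53, 10)}, index=index)
ax = df.plot.bar(rot=0)

# One Rectangle per bar, grouped by column:
# all Delay bars come first, then all Latency bars.
heights = [float(p.get_height()) for p in ax.patches]
print(len(ax.patches))
print(heights)
```

If the printed heights follow the column order of df, the bar height can be matched back to the data as described above.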
    


    • If the two dicts share any keys, they can't be combined; instead, the correct dict can be selected based on the order in which the bars are plotted.
    • All the bars for a single label are plotted first.
      • In this case, indices 0-3 are the Delay bars, and 4-7 are the Latency bars.
    for i, p in enumerate(ax.patches):
        print(i, p)
        x = p.get_x()
        w = p.get_width()
        h = p.get_height()
        
        if i < len(ax.patches)/2:  # select which dictionary to use
            d = Delay_error
        else:
            d = Latency_error
            
        min_y = d[h]['min']
        max_y = d[h]['max']
        plt.vlines(x+w/2, min_y, max_y, color='k')
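
A possible variation, sketched here under assumptions rather than taken from the original answer: instead of keying the limits by bar height, the (min, max) pairs can be stored by position and matched to the bars through ax.containers, which holds one BarContainer per plotted column. The limits dict and its layout are hypothetical; the values are those given in the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

index = ['T=0', 'T=26', 'T=50', 'T=900']
df = pd.DataFrame({'Delay': (53.46, 36.22, 83, 17),
                   'Latency': (38, 44, 53, 10)}, index=index)

# (min, max) pairs in row order, keyed by column name
# (hypothetical layout, chosen for this sketch)
limits = {
    'Delay': [(0, 60), (12, 70), (21, 54), (12, 70)],
    'Latency': [(2, 70), (12, 87), (9, 60), (11, 77)],
}

ax = df.plot.bar(rot=0)
ax.set(xlabel='Time', ylabel='(%)', ylim=(0, 101))

# ax.containers has one BarContainer per column, in column order;
# iterating a container yields its Rectangle patches in row order.
for col, container in zip(df.columns, ax.containers):
    for bar, (lo, hi) in zip(container, limits[col]):
        center = bar.get_x() + bar.get_width() / 2
        ax.vlines(center, lo, hi, color='k')  # one vertical line per bar
```

Because the lookup is positional, this does not depend on the bar heights being unique, either within a column or across columns.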