Tags: python, pandas, dataframe, bar-chart, errorbar

Plot bar chart with errorbars from multiple columns in a dataframe


I am trying to do something that should be simple, but I cannot find an answer in others' similar questions. I want to plot a bar graph of several groups of data stored in a dataframe, with error bar values that are also stored in the dataframe.

I have a dataframe exported from commercial software. It has multiple columns that I'd like to turn into a clustered bar graph, which so far I have only managed to do properly using df.plot.bar(). The remaining problem is that I cannot figure out how to add error bars correctly from the same dataframe.

This code works fine to generate the type of plot I want from sample data in the same format:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame()

#the groups can vary 
grp1 = 'a'
grp2 = 'b'
grp3 = 'c'

df['label'] = ['ID_1','ID_2','ID_3']
df[grp1+'_int'] = [5,5.5,6]
df[grp1+'_SD'] = [1,2,3]
df[grp2+'_int'] = [7,6,5]
df[grp2+'_SD'] = [2,1,1.5]
df[grp3+'_int'] = [6.5,5,5.5]
df[grp3+'_SD'] = [1.5,1.5,2]

ax = df.plot.bar(x='label', y=[grp1+'_int',grp2+'_int',grp3+'_int'])
plt.show()

How can I add errorbars (positive only is fine, but really any errorbars) from the corresponding *_SD columns?

Edit: the issue seems to be related to the number of rows in my real dataframe. Here are a non-working and a working test case:

Not Working (throws ValueError: err must be [ scalar | N, Nx1 or 2xN array-like ]):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame()

#the groups can vary 
grp1 = 'a'
grp2 = 'b'
grp3 = 'c'

df['label'] = ['ID_1','ID_2','ID_3','ID_4']
df[grp1+'_int'] = np.linspace(1,10,4)
df[grp1+'_SD'] = np.linspace(1,2,4)
df[grp2+'_int'] = np.linspace(2,8,4)
df[grp2+'_SD'] = np.linspace(1.5,3,4)
df[grp3+'_int'] = np.linspace(0.5,9,4)
df[grp3+'_SD'] = np.linspace(1,8,4)
print(df)
ax = df.plot.bar(x='label', y=[grp1+'_int',grp2+'_int',grp3+'_int'], yerr=df[[grp1+'_SD', grp2+'_SD', grp3+'_SD']].values)
plt.show()

Working:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame()

#the groups can vary 
grp1 = 'a'
grp2 = 'b'
grp3 = 'c'

df['label'] = ['ID_1','ID_2','ID_3']
df[grp1+'_int'] = np.linspace(1,10,3)
df[grp1+'_SD'] = np.linspace(1,2,3)
df[grp2+'_int'] = np.linspace(2,8,3)
df[grp2+'_SD'] = np.linspace(1.5,3,3)
df[grp3+'_int'] = np.linspace(0.5,9,3)
df[grp3+'_SD'] = np.linspace(1,8,3)
print(df)
ax = df.plot.bar(x='label', y=[grp1+'_int',grp2+'_int',grp3+'_int'], yerr=df[[grp1+'_SD', grp2+'_SD', grp3+'_SD']].values)
plt.show()

Solution

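  • Transpose the error array with .T before passing it as yerr. For a DataFrame plot, pandas reads raw error values with one row per plotted column, i.e. shape (number of plotted columns, number of rows). df[[...]].values has shape (number of rows, number of columns), so it only runs without an error when the two counts happen to match, as in the 3x3 example, and even then the errors are attached to the wrong bars.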

    Try this:

    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.DataFrame()
    
    #the groups can vary 
    grp1 = 'a'
    grp2 = 'b'
    grp3 = 'c'
    
    df['label'] = ['ID_1','ID_2','ID_3']
    df[grp1+'_int'] = [5,5.5,6]
    df[grp1+'_SD'] = [1,2,3]
    df[grp2+'_int'] = [7,6,5]
    df[grp2+'_SD'] = [2,1,1.5]
    df[grp3+'_int'] = [6.5,5,5.5]
    df[grp3+'_SD'] = [1.5,1.5,2]
    
    ax = df.plot.bar(x='label',
                     y=[grp1+'_int', grp2+'_int', grp3+'_int'],
                     # transpose so each row of yerr matches one plotted column
                     yerr=df[[grp1+'_SD', grp2+'_SD', grp3+'_SD']].T.values)
    plt.show()
    

    Output: [image: clustered bar chart with error bars]
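To check that the transpose also handles the 4-row case from the question's edit, here is a minimal sketch using the same numbers. The `ranges` dict and the `raw`, `int_cols`, and `sd_cols` names are only illustrative and not part of the original code. The `Agg` backend line is an assumption so the sketch runs without a display, and `capsize` is an optional cosmetic setting.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative helper: group -> ((int start, stop), (SD start, stop)),
# using the same numbers as the non-working example in the question.
ranges = {
    "a": ((1, 10), (1, 2)),
    "b": ((2, 8), (1.5, 3)),
    "c": ((0.5, 9), (1, 8)),
}

df = pd.DataFrame({"label": ["ID_1", "ID_2", "ID_3", "ID_4"]})
for grp, (int_range, sd_range) in ranges.items():
    df[grp + "_int"] = np.linspace(*int_range, 4)
    df[grp + "_SD"] = np.linspace(*sd_range, 4)

int_cols = [grp + "_int" for grp in ranges]
sd_cols = [grp + "_SD" for grp in ranges]

raw = df[sd_cols].to_numpy()  # shape (4, 3): one row per label
yerr = raw.T                  # shape (3, 4): one row per plotted column

# Passing `raw` here raises the ValueError from the question;
# the transposed array lines up with the three plotted columns.
ax = df.plot.bar(x="label", y=int_cols, yerr=yerr, capsize=3)
plt.tight_layout()
```

In an interactive session, drop the `matplotlib.use("Agg")` line and call `plt.show()` at the end. Building the column lists from the group names keeps the sketch working when the groups vary.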