How to add errorbars to grouped barplot with matplotlib?


I'm currently analyzing earthworm count data collected in a field experiment. My variables are the style of site (experimental site vs. reference site), the year the data were collected, the crop that was cultivated, and the number of earthworms. I use groupby() to group the earthworm counts by style, year, and crop and display them in a bar chart.

But when I try to add the respective standard deviations as error bars, I get the following error:

ValueError: 'yerr' (shape: (8,)) must be a scalar or a 1D or (2, n) array-like whose shape matches 'y' (shape: (4,))

I can't solve this; I'm pretty new to Python (and Stack Overflow, for that matter). Any tips would be greatly appreciated!

Here's my code:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

data = {'Style': ["Experiment","Reference", "Experiment", "Reference", "Experiment","Reference",
                  "Experiment", "Reference"],
        'Year': ["2021", "2021","2022","2022", "2021","2021", "2022", "2022"],
        'Crop': ["Rapeseed", "Rapeseed", "Rapeseed", "Rapeseed",
                 "Maize", "Maize", "Maize", "Maize"],
        'Earthworms': [55, 2, 2,6,0,1,7,22]
       }

df = pd.DataFrame(data)

#Set graph properties
fig, ax = plt.subplots(figsize=(15,7))
colors = {"Maize": "#de8f05", "Rapeseed":"#d7bb19"}         
labels = list(colors.keys())


#Create yerr variable
yerr = [10.6926766215636, 1.4142135623731, 0.577350269189626,1.414213562, 0,
        0.707106781186548, 2.857738033, 4.43471156521669]


#Group by Year, Style, Crop (independent variables); Earthworms is the dependent variable
df = df.groupby(["Year", "Style", "Crop"])["Earthworms"].sum().unstack().plot.bar(ax=ax, color=colors, yerr=yerr)

#Assign labels, axis ticks + limit, hide spines  

plt.ylabel("N", size=13, labelpad=10)
plt.yticks(fontsize=12)
plt.xticks(fontsize=12)
ax.set(xlabel=None)

plt.ylim(0,60)    
ax.spines.right.set_visible(False)
ax.spines.top.set_visible(False)
ax.margins(0.2,0)

Solution

  • As the error says, the issue is with the shape of yerr. Look at the result of your data frame manipulation before plotting:

    print(df.groupby(["Year", "Style", "Crop"])["Earthworms"].sum().unstack())
    
    Crop             Maize  Rapeseed
    Year Style                      
    2021 Experiment      0        55
         Reference       1         2
    2022 Experiment      7         2
         Reference      22         6
    

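    So there are 2 bars per year/style combination (one per crop), which means 8 bars in total, drawn as two series of 4 bars each. pandas therefore expects either a single yerr value that applies to all bars or a specific yerr for each bar, and the per-bar values need to be shaped (2, 4): one row per crop, one entry per year/style combination. A flat list of 8 values does not fit a series of 4 bars, which is what the error message is complaining about (yerr has shape (8,), y has shape (4,)).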

    I don't know which errors correspond to which bar, but a simple reshape, yerr = np.array(yerr).reshape(2, 4), makes the code run and produces a plot without an error. I'll leave it to you to figure out how to build yerr so that the correct errors go to the correct bars. Note that when reshaping like this, the first 4 values go to the first row and the next 4 to the second row. The 0 is in the 5th position and the first yellow bar has no error bar, which suggests that the first row is applied to one crop and the second row to the other.
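
One way to get each error onto its matching bar is to carry the errors through the same groupby/unstack as the counts and pass the resulting DataFrame as yerr, since pandas matches DataFrame errors to the plotted columns by name. This is only a minimal sketch: it assumes the yerr list in the question is in the same order as the rows of data, and the column name SD and the variable names counts and errors are made up here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

data = {
    "Style": ["Experiment", "Reference"] * 4,
    "Year": ["2021", "2021", "2022", "2022"] * 2,
    "Crop": ["Rapeseed"] * 4 + ["Maize"] * 4,
    "Earthworms": [55, 2, 2, 6, 0, 1, 7, 22],
    # Assumption: the yerr values from the question are listed in the same
    # order as the rows above. "SD" is a hypothetical column name.
    "SD": [10.6926766215636, 1.4142135623731, 0.577350269189626, 1.414213562,
           0, 0.707106781186548, 2.857738033, 4.43471156521669],
}
df = pd.DataFrame(data)

keys = ["Year", "Style", "Crop"]
# Pivot counts and errors the same way so they share index and columns.
counts = df.groupby(keys)["Earthworms"].sum().unstack()
errors = df.groupby(keys)["SD"].first().unstack()

colors = {"Maize": "#de8f05", "Rapeseed": "#d7bb19"}
fig, ax = plt.subplots(figsize=(15, 7))
# A DataFrame passed as yerr is aligned with the plotted columns by label.
counts.plot.bar(ax=ax, color=colors, yerr=errors)
```

Under that ordering assumption, the 0 ends up on the Maize bar for 2021/Experiment (which has a count of 0). The plain reshape(2, 4), by contrast, appears to hand the first four values to Maize, the first column after unstack(), even though they were listed alongside the Rapeseed rows.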