pandas, list, dataframe, matplotlib, pie-chart

Matplotlib: matching the number of values to the number of labels -- ValueError: 'label' must be of length 'x'


I have a df called high that looks like this:

   white  black  asian  native  NH_PI  latin
0  10239  26907   1079     670     80   1101

I'm trying to create a simple pie chart with matplotlib. I've looked at multiple examples and other SO pages like this one, but I keep getting this error:

Traceback (most recent call last):
  File "I:\Sustainability & Resilience\Food Policy\Interns\Lara Haase\data_exploration.py", line 62, in <module>
    plt.pie(sizes, explode=None, labels = high.columns, autopct='%1.1f%%', shadow=True, startangle=140)
  File "C:\Python27\ArcGIS10.6\lib\site-packages\matplotlib\pyplot.py", line 3136, in pie
    frame=frame, data=data)
  File "C:\Python27\ArcGIS10.6\lib\site-packages\matplotlib\__init__.py", line 1819, in inner
    return func(ax, *args, **kwargs)
  File "C:\Python27\ArcGIS10.6\lib\site-packages\matplotlib\axes\_axes.py", line 2517, in pie
    raise ValueError("'label' must be of length 'x'")
ValueError: 'label' must be of length 'x'

I've tried multiple different ways to make sure the labels and values match up. There are 6 of each, but I can't understand why Python disagrees with me.

Here is one way I've tried:

plt.pie(high.values, explode=None, labels = high.columns, autopct='%1.1f%%', shadow=True, startangle=140)

And another way:

labels = list(high.columns)
sizes = list(high.values)
plt.pie(sizes, explode=None, labels = labels, autopct='%1.1f%%', shadow=True, startangle=140)

I have also tried with .loc:

labels = list(high.columns)
sizes = high.loc[[0]]
print(labels)
print(sizes)
plt.pie(sizes, explode=None, labels = labels, autopct='%1.1f%%', shadow=True, startangle=140)

But no matter what I've tried, I keep getting that same ValueError. Any thoughts?


Solution

  • Just to expand on @ScottBoston's post,

    Plotting a pie chart from a data frame with one row is not possible unless you reshape the data into a single column or series.

    An operation I typically use is .stack():

    df = df.stack()
    

    .stack() is very similar to .T, but it returns a Series whose index has two levels: the original row label first and the column name second. This is handy when you have multiple rows and want to retain the original indexing. The result of df.stack() is:

    0  white     10239
       black     26907
       asian      1079
       native      670
       NH_PI        80
       latin      1101
    dtype: int64
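
As a minimal sketch of the difference between .stack() and .T, assuming a one-row frame rebuilt by hand from the values shown in the question (the variable names `stacked` and `transposed` are only for illustration):

```python
import pandas as pd

# One-row frame mirroring the `high` frame from the question.
high = pd.DataFrame(
    {"white": [10239], "black": [26907], "asian": [1079],
     "native": [670], "NH_PI": [80], "latin": [1101]}
)

stacked = high.stack()  # Series indexed by (row label, column name)
transposed = high.T     # DataFrame with a single column, indexed by column name

print(type(stacked).__name__, stacked.shape)        # Series (6,)
print(type(transposed).__name__, transposed.shape)  # DataFrame (6, 1)
print(stacked.index.nlevels)                        # 2
```

The stacked result is one-dimensional, which is the shape a pie chart needs, while the transpose is still a two-dimensional frame.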
    

    After I stack() a data frame, I typically assign a name to the resulting Series using:

    df.name = 'Race'
    

    Setting a name is not required, but it helps when you plot the data with the pandas plot.pie method (the data is now a Series, so this is pd.Series.plot.pie).

    If the data frame df had more than one row of data, you could then plot a pie chart for each row using .groupby:

    for name, group in df.groupby(level=0):
        group.index = group.index.droplevel(0)
        group.plot.pie(autopct='%1.1f%%', shadow=True, startangle=140)
    

    Since the first level of the index only provides the positional index from the input data, I drop that level to make the labels on the plot appear as desired.
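
A self-contained sketch of the per-row approach might look like the following. The first row comes from the question; the second row is made-up illustration data, and the Agg backend is selected only so the sketch runs without a display. It uses Series.droplevel instead of reassigning the index, and one figure per row, both of which are choices made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Row 0 is from the question; row 1 is hypothetical data for illustration.
df = pd.DataFrame(
    {"white": [10239, 5000], "black": [26907, 4000], "asian": [1079, 3000],
     "native": [670, 2000], "NH_PI": [80, 1000], "latin": [1101, 500]}
)

stacked = df.stack()
stacked.name = "Race"

axes_by_row = {}
for row_label, group in stacked.groupby(level=0):
    # Drop the row-label level so the wedge labels are just the column names.
    group = group.droplevel(0)
    fig, ax = plt.subplots()
    group.plot.pie(ax=ax, autopct="%1.1f%%", startangle=140)
    axes_by_row[row_label] = ax
```

Each entry in `axes_by_row` holds the axes for one input row, so the figures can be saved or shown individually afterwards.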


    If you don't want to use pandas to make the pie chart, this worked for me:

    plt.pie(df.squeeze().values, labels=df.columns.tolist(), autopct='%1.1f%%', shadow=True, startangle=140)
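
For reference, a runnable version of the matplotlib-only call could be sketched as below, assuming the one-row frame is rebuilt from the values in the question and that the Agg backend is acceptable (it is chosen here only so the code runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

high = pd.DataFrame(
    {"white": [10239], "black": [26907], "asian": [1079],
     "native": [670], "NH_PI": [80], "latin": [1101]}
)

# squeeze() collapses the single row, leaving a Series of 6 values.
sizes = high.squeeze().values
labels = high.columns.tolist()

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    sizes, labels=labels, autopct="%1.1f%%", startangle=140
)
```

Note that squeeze() only helps when the frame has exactly one row; on a frame with several rows and columns it returns the frame unchanged, so high.iloc[0] is a more explicit way to pick a single row.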
    

    This attempt didn't work because high.values is a 2-D array of shape (1, 6), so its length is 1 while there are 6 labels.

    #attempt 1
    plt.pie(high.values, explode=None, labels = high.columns, autopct='%1.1f%%', shadow=True, startangle=140)
    

    This attempt didn't work because list(high.values) returns a one-element list whose only element is the array of 6 values, so there is still 1 value for 6 labels.

    #attempt 2
    labels = list(high.columns)
    sizes = list(high.values)
    plt.pie(sizes, explode=None, labels = labels, autopct='%1.1f%%', shadow=True, startangle=140)
    

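    The last attempt didn't work because high.loc[[0]] (with a list as the indexer) returns a one-row DataFrame rather than a Series, and matplotlib does not know how to parse a DataFrame as the input here. high.loc[0] would return a Series instead.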

    labels = list(high.columns)
    sizes = high.loc[[0]]
    print(labels)
    print(sizes)
    plt.pie(sizes, explode=None, labels = labels, autopct='%1.1f%%', shadow=True, startangle=140)
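
The shape mismatches behind the three attempts can be checked without plotting anything. This is a small diagnostic sketch, assuming the frame is rebuilt from the values in the question:

```python
import pandas as pd

high = pd.DataFrame(
    {"white": [10239], "black": [26907], "asian": [1079],
     "native": [670], "NH_PI": [80], "latin": [1101]}
)

# Attempt 1: the values are 2-D, so the outer length is 1, not 6.
print(high.values.shape)             # (1, 6)
print(len(high.values))              # 1

# Attempt 2: list() only unpacks the outer dimension.
print(len(list(high.values)))        # 1

# Attempt 3: a list indexer keeps the result a DataFrame.
print(type(high.loc[[0]]).__name__)  # DataFrame

# A scalar indexer returns a 1-D Series that matches the 6 labels.
print(high.loc[0].shape)             # (6,)
print(len(high.columns))             # 6
```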