Pandas plot error : Missing category information for StrCategoryConverter; this might be caused by unintendedly mixing categorical and numeric data


I am trying to draw a line plot with the pandas plot method. The same data, plotted the same way, works fine with matplotlib directly. But if I use df.plot and then annotate, I get ValueError: Missing category information for StrCategoryConverter; this might be caused by unintendedly mixing categorical and numeric data

Say I have a dataframe:

import pandas as pd

data = {'Unit': {0: 'Admin ', 1: 'C-Level', 2: 'Engineering', 3: 'IT', 4: 'Manufacturing', 5: 'Sales'}, 'Mean': {0: 4.642857142857143, 1: 4.83, 2: 4.048, 3: 4.237317073170732, 4: 4.184319526627219, 5: 3.9904545454545453}}
results = pd.DataFrame(data)

When using matplotlib

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(18, 9))
ax.plot(results['Unit'], results['Mean'])

for i, val in enumerate(zip(results['Unit'], results['Mean'])):
    label = str(results.loc[i, 'Mean'])
    ax.annotate(label, val, ha='center')

plt.show()

The above code works perfectly fine.

Using the pandas plot function (which gives me an error):

results.plot(x = 'Unit', y = 'Mean', marker = 'o', figsize=(8,5))
ax = plt.gca()
for i, val in enumerate(zip(results['Unit'],results['Mean'])):
    label = str(results.loc[i, 'Mean'])
    ax.annotate(label, val, ha='center')
plt.show()

which gives me this error:

Traceback (most recent call last):
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\axis.py", line 1506, in convert_units
    ret = self.converter.convert(x, self.units, self)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\category.py", line 49, in convert
    raise ValueError(
ValueError: Missing category information for StrCategoryConverter; this might be caused by unintendedly mixing categorical and numeric data

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\backends\backend_qt.py", line 477, in _draw_idle
    self.draw()
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\backends\backend_agg.py", line 436, in draw
    self.figure.draw(self.renderer)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\artist.py", line 73, in draw_wrapper
    result = draw(artist, renderer, *args, **kwargs)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\artist.py", line 50, in draw_wrapper
    return draw(artist, renderer)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\figure.py", line 2837, in draw
    mimage._draw_list_compositing_images(
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\artist.py", line 50, in draw_wrapper
    return draw(artist, renderer)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\axes\_base.py", line 3091, in draw
    mimage._draw_list_compositing_images(
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\artist.py", line 50, in draw_wrapper
    return draw(artist, renderer)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\text.py", line 1969, in draw
    if not self.get_visible() or not self._check_xy(renderer):
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\text.py", line 1559, in _check_xy
    xy_pixel = self._get_position_xy(renderer)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\text.py", line 1552, in _get_position_xy
    return self._get_xy(renderer, x, y, self.xycoords)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\text.py", line 1419, in _get_xy
    x = float(self.convert_xunits(x))
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\artist.py", line 252, in convert_xunits
    return ax.xaxis.convert_units(x)
  File "C:\Users\hpoddar\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\axis.py", line 1508, in convert_units
    raise munits.ConversionError('Failed to convert value(s) to axis '
matplotlib.units.ConversionError: Failed to convert value(s) to axis units: 'Admin '

Expected output :

Annotated graph

Why am I getting an error with the pandas plot, and how can I resolve it?


Solution

  • matplotlib.axes.Axes.annotate() takes the parameter xy, documented as:

    xy : (float, float)

    The point (x, y) to annotate. The coordinate system is determined by xycoords.

    Therefore, to fix the issue, pass the tuple (i, val[1]) instead of val, whose first element is a string.

    ax = results.plot(x = 'Unit', y = 'Mean', marker = 'o', figsize=(8,5))
    for i, val in enumerate(zip(results['Unit'],results['Mean'])):
        label = str(results.loc[i, 'Mean'])
        ax.annotate(text=label, xy=(i, val[1]), ha='center')
    

    Another option is to use matplotlib.axes.Axes.text():

    ax = results.plot(x = 'Unit', y = 'Mean', marker = 'o', figsize=(8,5))
    for i, val in enumerate(zip(results['Unit'],results['Mean'])):
        label = str(results.loc[i, 'Mean'])
        ax.text(x=i, y=val[1], s=label, ha='center')
    

    As you noted, calling annotate() after matplotlib.axes.Axes.plot raises no ConversionError, but calling annotate() after pandas.DataFrame.plot does.

    My best guess is that after plotting the strings with matplotlib, the x-axis already holds the category information, so the string passed to annotate can be converted to a position. pandas, by contrast, appears to draw the line at integer positions 0, 1, 2, ... and only set the tick labels to the strings, so the axis has no category information and the conversion of the string fails.
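One way to check this guess is to inspect the axes after each kind of plot. The sketch below is only an illustration: it assumes the non-interactive Agg backend so it can run without a display, and the variable names (ax_mpl, ax_pd, mpl_units, pd_units, pd_xdata) are made up for the example. It uses Axis.get_units() to look at the x-axis units and Line2D.get_xdata() to look at the x values pandas actually plotted.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

data = {'Unit': {0: 'Admin ', 1: 'C-Level', 2: 'Engineering', 3: 'IT',
                 4: 'Manufacturing', 5: 'Sales'},
        'Mean': {0: 4.642857142857143, 1: 4.83, 2: 4.048,
                 3: 4.237317073170732, 4: 4.184319526627219,
                 5: 3.9904545454545453}}
results = pd.DataFrame(data)

# matplotlib: plotting strings attaches category units to the x-axis
fig_mpl, ax_mpl = plt.subplots()
ax_mpl.plot(results['Unit'], results['Mean'])
mpl_units = ax_mpl.xaxis.get_units()

# pandas: inspect what the x-axis and the plotted line hold
ax_pd = results.plot(x='Unit', y='Mean', marker='o')
pd_units = ax_pd.xaxis.get_units()
pd_xdata = list(ax_pd.lines[0].get_xdata())

print("matplotlib x-axis units:", mpl_units)
print("pandas x-axis units:    ", pd_units)
print("pandas line x data:     ", pd_xdata)
```

If the pandas line reports integer x data while the matplotlib axis reports category units, that would explain why (i, val[1]) works on the pandas axes and a string x value does not.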

    To demonstrate, the following code raises the same error once the figure is drawn:

    fig, ax = plt.subplots(figsize=(18,9))
    # Call annotate without plotting! Throws ConversionError
    # ax.plot(results['Unit'],results['Mean'])
    
    for i, val in enumerate(zip(results['Unit'],results['Mean'])):
        label = str(results.loc[i, 'Mean'])
        ax.annotate(label, val, ha='center')
    

    Whereas if we do the following, no error will be thrown:

    fig, ax = plt.subplots(figsize=(18,9))
    # Call annotate without plotting! No error
    # ax.plot(results['Unit'],results['Mean'])
    
    for i, val in enumerate(zip(results['Unit'],results['Mean'])):
        label = str(results.loc[i, 'Mean'])
        ax.annotate(text=label, xy=(i, val[1]), ha='center')
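Putting the fix together, here is a minimal end-to-end sketch. The helper name annotate_by_position, the two-decimal label format, and the small vertical offset are illustrative choices rather than part of the original question, and the Agg backend is assumed only so the example runs without a display. Drawing the canvas explicitly forces the rendering step where the original ConversionError surfaced.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this for interactive use
import matplotlib.pyplot as plt
import pandas as pd

data = {'Unit': {0: 'Admin ', 1: 'C-Level', 2: 'Engineering', 3: 'IT',
                 4: 'Manufacturing', 5: 'Sales'},
        'Mean': {0: 4.642857142857143, 1: 4.83, 2: 4.048,
                 3: 4.237317073170732, 4: 4.184319526627219,
                 5: 3.9904545454545453}}
results = pd.DataFrame(data)


def annotate_by_position(ax, values, fmt="{:.2f}"):
    """Hypothetical helper: label the i-th value at numeric x position i."""
    for i, y in enumerate(values):
        # xy is purely numeric, so no string-to-axis-unit conversion is needed
        ax.annotate(fmt.format(y), xy=(i, y), xytext=(0, 6),
                    textcoords="offset points", ha="center")


ax = results.plot(x='Unit', y='Mean', marker='o', figsize=(8, 5))
annotate_by_position(ax, results['Mean'])

# Rendering is where the original error was raised, so force a draw here
ax.figure.canvas.draw()

labels = [t.get_text() for t in ax.texts]
print(labels)
```

Because the annotation coordinates are plain numbers, the same helper should also work on axes created with ax.plot(results['Unit'], results['Mean']), provided the unit names are unique so that each category sits at its index.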