python matplotlib categorical-data python-datetime

Avoid interpreting strings as dates when plotting categorical data in matplotlib


I am trying to plot categorical data in matplotlib with string entries that look like dates but are not dates. Matplotlib tries to automatically convert the string to a datetime object, but fails. How can I force matplotlib to treat the categories as strings and prevent it from trying to convert the string to a datetime object?

Here's my example:

import matplotlib.pyplot as plt
categories = ['2019-20', '2020-21']
vals = [5, 10]
plt.plot(categories, vals)

Which gives

ValueError: could not convert string to float: '2019-20'
<...snip...>
calendar.IllegalMonthError: bad month number 20; must be 1-12

For what it's worth, in my example, the strings represent academic years (2019-2020 and 2020-2021), but matplotlib assumes that they are dates in the form YYYY-MM and throws an error when trying to convert "20" and "21" to a valid month.

If I change the categories to ['2019-2020', '2020-2021'], the code works fine (matplotlib no longer assumes the strings represent a datetime object).

import matplotlib.pyplot as plt
categories = ['2019-2020', '2020-2021']
vals = [5, 10]
plt.plot(categories, vals)


But I prefer to use the shorter version YYYY-YY rather than the longer YYYY-YYYY.


Solution

  • plt.plot draws an X-Y Cartesian plot when you pass in two positional arguments, so the strings are interpreted as x values. I think you need to pass only the values, as in plt.plot(vals), and then set the tick labels with plt.xticks:

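    import matplotlib.pyplot as plt
    import numpy as np
    categories = ['2019-20', '2020-21']
    vals = [5, 10]
    # With a single argument, the x values default to 0, 1, ..., len(vals) - 1
    plt.plot(vals)
    # Place one tick per value and label it with the category string
    plt.xticks(np.arange(len(vals)), tuple(categories))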

    For details on plt.xticks, refer to https://matplotlib.org/api/_as_gen/matplotlib.pyplot.xticks.html
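
The same idea can also be sketched with the object-oriented interface, which may be easier to reuse when a figure has several axes. This is a minimal sketch, not the only way to do it: the strings are only ever used as tick labels, and the data is plotted against integer positions. The Agg backend and the output filename academic_years.png are assumptions made so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

categories = ['2019-20', '2020-21']
vals = [5, 10]

# Plot against integer positions so the strings are never passed as x data
positions = list(range(len(categories)))

fig, ax = plt.subplots()
ax.plot(positions, vals)

# Put one tick at each position and use the category strings purely as labels
ax.set_xticks(positions)
ax.set_xticklabels(categories)

fig.savefig("academic_years.png")  # hypothetical output filename
```

Because the x data here is [0, 1], matplotlib has no strings to convert; the labels '2019-20' and '2020-21' are attached to the ticks as plain text.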