Tags: python-3.x, pandas, time-series, pandas-groupby

Month name from pandas series datetime index using Grouper


I'm plotting a year of data (June to May) as a box-and-whisker plot, with one box per month.

I have the data in a pandas series:

Date
2018-06-01    0.012997
2018-06-02    0.009615
2018-06-03    0.012884
2018-06-04    0.013358
2018-06-05    0.013322
2018-06-06    0.011532
2018-06-07    0.018297
2018-06-08    0.018820
2018-06-09    0.031254
2018-06-10    0.016529
...
Name: Value, dtype: float64

I can plot it, but I can't get the month names onto the columns, so the boxes are labelled with plain numbers. Since the months don't run from January to December, the numbers 1 to 12 are misleading.

Is there any way to get the month names when I create such a DataFrame using Grouper?

The code I'm using is originally from https://machinelearningmastery.com/time-series-data-visualization-with-python/

If I understand correctly, the Grouper arranges the series into an array that contains the data per month, so I guess that would be the point when it would be possible (if at all):

groups = series.groupby(pd.Grouper(freq = 'M'))
months = pd.concat([pd.DataFrame(x[1].values) for x in groups], axis=1)

I searched but couldn't find any hint on how to name a column based on a condition when building it with pd.DataFrame. I would be really grateful if anyone could point me in the right direction.

import pandas as pd
import matplotlib.pyplot as plt

fig = plt.figure(figsize = (16,8))

#some code for other plots

ax3 = fig.add_subplot(212)
groups = series.groupby(pd.Grouper(freq = 'M'))
# one column per month; shorter months are padded with NaN
months = pd.concat([pd.DataFrame(x[1].values) for x in groups], axis=1)
# columns end up numbered 1-12 although the data starts in June
months.columns = range(1,13)
months.boxplot()
ax3.set_title('Results June 2018 - May 2019')

plt.show()

Solution

  • You can call strftime with the '%B' format code on the datetime index to get the corresponding month names, and then use those names as the labels in the plot.

    Here's some example code:

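    import pandas as pd
    import matplotlib.pyplot as plt

    # small sample spread over several months
    series = pd.Series({'2018-06-01':0.012997,
    '2018-06-02':0.009615,
    '2018-07-03':0.012884,
    '2018-06-04':0.013358,
    '2018-08-05':0.013322,
    '2018-09-06':0.011532,
    '2018-10-07':0.018297,
    '2018-11-08':0.018820,
    '2018-12-09':0.031254,
    '2018-06-10':0.016529})

    # the string keys must become a DatetimeIndex before grouping by month
    series.index = pd.to_datetime(series.index)

    fig = plt.figure(figsize = (16,8))

    ax3 = fig.add_subplot(212)
    # 'M' = month end; pandas 2.2+ prefers the alias 'ME'
    group = series.groupby(pd.Grouper(freq = 'M')).sum()
    # '%B' formats each month-end timestamp as the full month name
    ax3.bar(group.index.strftime('%B'), group)

    ax3.set_title('Results June 2018 - May 2019')

    plt.show()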
    

    And here's the corresponding plot it produces:

    [Image: Graph with month names]
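
Since the question asks for a box plot rather than a bar chart, the same '%B' idea can be carried over to the column names of the per-month DataFrame. The following is only a sketch under a few assumptions: the data is synthetic (random values standing in for the real series), the grouping uses to_period('M') instead of pd.Grouper so that it does not depend on the 'M'/'ME' alias of a particular pandas version, and the names `columns` and `months` are purely illustrative. The month names also assume an English locale.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assumption: synthetic daily data for June 2018 - May 2019,
# standing in for the series from the question.
idx = pd.date_range("2018-06-01", "2019-05-31", freq="D", name="Date")
rng = np.random.default_rng(0)
series = pd.Series(rng.random(len(idx)) * 0.03, index=idx, name="Value")

# Group by calendar month. The group keys are monthly Periods,
# returned in chronological order (2018-06 ... 2019-05).
columns = {}
for period, values in series.groupby(series.index.to_period("M")):
    # Use the full month name as the column label; drop the dates so
    # every column shares a simple 0..n-1 index.
    columns[period.strftime("%B")] = values.reset_index(drop=True)

# Shorter months are padded with NaN, which boxplot ignores.
months = pd.DataFrame(columns)

fig, ax = plt.subplots(figsize=(16, 4))
months.boxplot(ax=ax)
ax.set_title("Results June 2018 - May 2019")
plt.show()
```

Because the columns are created in the order the groups come out, the boxes run from June to May without any manual reordering. If the data spanned more than one year, the month names would collide, so a label such as '%b %Y' would probably be a safer choice there.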