I am trying to plot a graph that shows changes in monthly values with matplotlib (Python) and pandas, and I am struggling with it as I am new. Apologies in advance for the formatting.
Here is a sample of the dataframe:
Year Month Value
2014 January 510
2014 February 542
2014 March 684
2014 April 700
2014 May 732
2014 July 603
2014 August 643
2014 September 680
2014 October 723
2014 November 760
2014 December 810
2015 January 920
2015 February 900
2015 March 780
2015 April 710
2015 May 810
2015 July 895
2015 August 906
2015 September 945
2015 October 980
2015 November 1000
2015 December 1123
Here's what I tried (shortened version):
import matplotlib.pyplot as plt
plt.title('Monthly data over several years')
plt.plot(df['Month'].to_list(), df['Value'].to_list())
This results in the values for each month of 2015 being plotted back at the x-axis position of the same month in 2014, instead of continuing along the x axis after December 2014. For example, January 2015's value of 920 is drawn at the January 2014 position.
I also tried:
months = {"January": 1, "February": 2, "March": 3, "April": 4, "May": 5, "June": 6,
"July": 7, "August": 8, "September": 9, "October": 10, "November": 11, "December": 12}
df['Month'] = df.Month.map(months)
df['Date'] = df['Year'].map(str) + '-' + df['Month'].map(str)
This was meant to concatenate each month with its year to avoid the previous problem. However, I just ended up with NaN values in the df['Month'] column. I also suspect that, even if it did work, it would overcrowd the tick labels on the x axis.
What I want is a time-series-style plot showing the monthly changes in the values over the two years. I'm struggling to lay out the xticks neatly on the graph. Alternatively, would a bar graph work better?
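For reference, one possible way to get a continuous time axis is to parse Year and Month into real dates and let matplotlib's date locators thin out the ticks. This is only a minimal sketch on a small made-up subset of the sample data, not necessarily the best approach. It assumes the month names are in English and that the default locale is in use (the `%B` format code is locale-dependent); the 3-month tick interval and the "Jan 2014" label style are arbitrary choices.

```python
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Small sample in the same shape as the dataframe in the question
df = pd.DataFrame({
    "Year": [2014, 2014, 2014, 2015, 2015, 2015],
    "Month": ["January", "February", "March", "January", "February", "March"],
    "Value": [510, 542, 684, 920, 900, 780],
})

# Build strings like "2014-January" and parse them into timestamps
# (%B = full month name; str.strip() guards against stray whitespace)
df["Date"] = pd.to_datetime(
    df["Year"].astype(str) + "-" + df["Month"].str.strip(), format="%Y-%B"
)
df = df.sort_values("Date")

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(df["Date"], df["Value"], marker="o")
ax.set_title("Monthly data over several years")

# One tick every third month, labelled like "Jan 2014", to avoid crowding
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

Because the x values are real dates rather than repeated month names, January 2015 lands after December 2014 instead of on top of January 2014. Call `plt.show()` or `fig.savefig(...)` afterwards to view the result.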
It is much easier if you pivot the Year and Month columns so that each year becomes a separate series:
import pandas as pd
import matplotlib.pyplot as plt
fig, axes = plt.subplots(figsize=(6,4))
df = pd.read_csv("data.csv")
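# Pivot so that each row is a month and each year gets its own column of values
df2 = pd.pivot_table(df, index="Month", columns=["Year"])
# Restore calendar order (the pivot sorts the month names alphabetically); June is absent from the data
df2 = df2.reindex(['January', 'February', 'March', 'April', 'May', 'July', 'August', 'September', 'October', 'November', 'December'])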