I have been struggling with this for a few days and I feel like it's something really simple that I am missing. I thought I would reach out and hopefully someone can help me. All I am trying to do is plot a line graph from my dataset. I have tried many different methods but it keeps showing up with the same problem.
Here's my code.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('penalty_data_set_2.csv')
df['OFFENCE_MONTH'] = pd.to_datetime(df['OFFENCE_MONTH'])
print(df.head())
values = df.groupby('OFFENCE_MONTH')['TOTAL_NUMBER'].sum()
plt.plot(values.index,values)
Here's what I get: [plot image]
And the dataframe df: [image of the printed dataframe]
The dataset that I am using is from https://www.kaggle.com/llihan/australia-nsw-traffic-penalty-data-20112017
Can you check whether the dates are being parsed with the day and month in the right order (YYYY-MM-DD rather than YYYY-DD-MM)?
It looks like matplotlib is putting all the values in the first days of each year, which is what you would see if the dates were parsed incorrectly, with days and months swapped.
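To make that check concrete, here is a minimal sketch. It assumes the OFFENCE_MONTH column stores day-first strings such as 01/02/2011 (DD/MM/YYYY); I have not verified that against the CSV, so the small `sample` frame below is a made-up stand-in and not the real data. If the assumption holds, passing an explicit `format` to `pd.to_datetime` should fix the grouping.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: three months of data,
# written day-first (DD/MM/YYYY). The real file may differ.
sample = pd.DataFrame({
    "OFFENCE_MONTH": ["01/01/2011", "01/02/2011", "01/03/2011"],
    "TOTAL_NUMBER": [10, 20, 30],
})

# Default parsing treats the first field as the month, so
# "01/02/2011" becomes 2 January instead of 1 February.
wrong = pd.to_datetime(sample["OFFENCE_MONTH"])

# An explicit format removes the ambiguity.
right = pd.to_datetime(sample["OFFENCE_MONTH"], format="%d/%m/%Y")

print(wrong.dt.month.tolist())   # every row lands in January
print(right.dt.month.tolist())   # one row per month

# Monthly totals using the correctly parsed dates.
sample["OFFENCE_MONTH"] = right
values = sample.groupby("OFFENCE_MONTH")["TOTAL_NUMBER"].sum()
print(values)
```

If the months come out as expected, the same `format="%d/%m/%Y"` argument can go into the `pd.to_datetime` call in the question, and the `plt.plot(values.index, values)` line can stay as it is.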