Tags: python, pandas, matplotlib, plot, linegraph

Plotting line graphs in matplotlib with correct dates


I have been struggling with this for a few days, and I feel like I am missing something really simple, so I thought I would reach out in the hope that someone can help. All I am trying to do is plot a line graph from my dataset. I have tried many different methods, but the plot keeps showing the same problem.

Here's my code.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('penalty_data_set_2.csv')
df['OFFENCE_MONTH'] = pd.to_datetime(df['OFFENCE_MONTH'])
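print(df.head())  # head is a method; it must be called to print the first rows

# Sum TOTAL_NUMBER for each distinct OFFENCE_MONTH value
values = df.groupby('OFFENCE_MONTH')['TOTAL_NUMBER'].sum()

plt.plot(values.index, values)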

Here's what I get.

[Image: line graph plot]

And the dataframe df

The dataset that I am using is from https://www.kaggle.com/llihan/australia-nsw-traffic-penalty-data-20112017


Solution

  • Can you check if the date is parsed as YYYY-MM-DD instead of YYYY-DD-MM?

    It looks like matplotlib is putting all the values on the first days of each year, which is what you would see if the dates were parsed incorrectly, with days and months swapped.
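
To illustrate the day/month swap described above, here is a minimal sketch. It assumes the OFFENCE_MONTH column stores dates as DD/MM/YYYY strings; that format is a guess based on the symptom and is not confirmed by the question. The sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample values; the DD/MM/YYYY layout is an assumption.
raw = pd.Series(["01/07/2011", "01/08/2011", "01/09/2011"])

# Without a format, ambiguous strings are read month-first,
# so 1 July is interpreted as 7 January.
month_first = pd.to_datetime(raw)

# An explicit format removes the ambiguity.
day_first = pd.to_datetime(raw, format="%d/%m/%Y")

# Compare the months each parse produced.
print(month_first.dt.month.tolist())
print(day_first.dt.month.tolist())
```

If the first list contains only January while the second shows the months you expect, the same fix should apply to the real data: pass format='%d/%m/%Y' (or dayfirst=True) to pd.to_datetime when converting df['OFFENCE_MONTH'].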