python pandas histogram

Removing Year Decimals from x-axis in Pandas Histogram


I am trying to plot a histogram of my data by year; however, the x-axis 'Year' is shown in decimal format (e.g. 2020.0, 2020.25, ...) rather than integer format.

[Image: histogram]

I extracted the year from the date string using

df['Year'] = pd.DatetimeIndex(df['INCI_DATE']).year

then plotted it using the pandas hist command

hist = df.Year.hist(bins=3, figsize=(10,10))

I tried converting the 'Year' column to int, but got the same result:

df.Year = df.Year.astype(int)

How can I get the years in integer format only?


Solution

  • If you only want to see the years on the axis, don't convert the column to a year-only number. Leave it as datetime and plot that; Matplotlib can then format the ticks the way you need. Use set_major_locator() and set_major_formatter() on the x-axis for this. The updated code is below.

    import matplotlib.dates as mdates

    # Keep the full datetime (no .year) so Matplotlib treats the axis as dates.
    # You can overwrite INCI_DATE instead of creating a new column.
    df['Year'] = pd.to_datetime(df['INCI_DATE'])
    hist = df.Year.hist(bins=3, figsize=(10, 4))

    hist.xaxis.set_major_locator(mdates.YearLocator())          # one tick per year
    hist.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))  # show only the year
    

    [Image: output plot]
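If you would rather keep 'Year' as plain integers, a possible alternative (not part of the answer above) is to set the bin edges and tick positions yourself. Casting the column to int does not change the axis because Matplotlib picks tick positions from the axis range, not from the column dtype. The sketch below is only an illustration: the sample dates are made up to stand in for INCI_DATE, and the choice of one bin per year with edges on the half-years is an assumption about what you want to see.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to show the plot on screen
import pandas as pd
from matplotlib.ticker import FormatStrFormatter

# Hypothetical sample data standing in for the question's INCI_DATE column.
df = pd.DataFrame({
    "INCI_DATE": ["2019-03-01", "2019-11-15", "2020-01-20",
                  "2020-06-30", "2020-09-09", "2021-02-14"],
})
df["Year"] = pd.to_datetime(df["INCI_DATE"]).dt.year

# One bin per year, with edges at the half-years so each bar is centred on its year.
first, last = int(df["Year"].min()), int(df["Year"].max())
edges = [year - 0.5 for year in range(first, last + 2)]

ax = df["Year"].hist(bins=edges, figsize=(10, 4))

# Put ticks only on whole years and print them without a decimal part.
ax.set_xticks(list(range(first, last + 1)))
ax.xaxis.set_major_formatter(FormatStrFormatter("%d"))
```

With this approach the number of bins follows the range of years in the data instead of being fixed at bins=3, so gaps between years show up as empty bars.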