Histogram using plot in Pandas - set x label


Dataframe:


Horror films released in 2019
Title            Director            Country        Year
3 from Hell      Rob Zombie          United States  2019
Bliss            Joe Begos           United States  2019
Bedeviled        The Vang Brothers   United States  2016
Creep 2          Patrick Brice       United States  2017
Brightburn       David Yarovesky     United States  2019
Delirium         Dennis Iliadis      Ireland        2018
Child's Play     Lars Klevberg       United States  2019
The Conjuring 2  James Wan           United States  2016
Bloodlands       Steven Kastrissios  Albania        2017
Bird Box         Susanne Bier        United States  2017

I need to plot a histogram showing the number of titles released per year, using the Pandas plot function.

Code:

import pandas as pd

df = pd.read_csv(filename)
grouped = df.groupby('Year').count()[['Title']]
new_df = grouped.reset_index()
xtick = new_df['Year'].tolist()
width = new_df.Year[1] - new_df.Year[0]
new_df.iloc[:, 1:2].plot(kind='bar', width=width)

I cannot figure out how to label the x-axis with the values from the Year column, and I am also unsure whether my approach is correct.

Thanks in advance :)


Solution

  • It sounds like you want a bar chart rather than a histogram, because years are discrete (categorical) values. You already pass kind='bar' in your plot call, so you are on the right track. Try the following and see if it works for you. I forced the y-axis ticks to be integers since you are plotting counts, but that is optional.

    import pandas as pd
    import matplotlib.pyplot as plt
    
    title = [ 'Movie1','Movie2','Movie3',
            'Movie4','Movie5','Movie6',
            'Movie7','Movie8','Movie9',
    ]
    
    year = [2019,2019,2018,
            2017,2019,2018,
            2019,2017,2018
    ]
    
    df = pd.DataFrame(list(zip(title, year)), 
                      columns =['Title', 'Year']
                     )
    
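    print(df)
    # Count titles per year, then turn Year back into a regular column
    group = df.groupby('Year').count()[['Title']]\
            .rename(columns={'Title': 'No. of Movies'})\
            .reset_index()
    print(group)
    
    # x='Year' labels the bars with the years; rot=0 keeps the labels horizontal
    ax = group.plot.bar(x='Year', rot=0)
    # Restrict the y-axis ticks to integers, since the values are counts
    ax.yaxis.get_major_locator().set_params(integer=True)
    plt.show()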
    

        Title  Year
    0  Movie1  2019
    1  Movie2  2019
    2  Movie3  2018
    3  Movie4  2017
    4  Movie5  2019
    5  Movie6  2018
    6  Movie7  2019
    7  Movie8  2017
    8  Movie9  2018
    

       Year  No. of Movies
    0  2017              2
    1  2018              3
    2  2019              4
    

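For reference, the same idea applied to the ten sample rows from the question might look like the sketch below. It is an alternative rather than the answer's exact method: it uses `value_counts()` in place of `groupby(...).count()`, and sets the axis labels explicitly with `set_xlabel` and `set_ylabel`. The inline DataFrame stands in for the CSV file, and the label text is an assumption.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

# The ten sample rows from the question (Title and Year columns only),
# built inline here instead of being read from the CSV file.
df = pd.DataFrame({
    "Title": ["3 from Hell", "Bliss", "Bedeviled", "Creep 2", "Brightburn",
              "Delirium", "Child's Play", "The Conjuring 2", "Bloodlands",
              "Bird Box"],
    "Year": [2019, 2019, 2016, 2017, 2019, 2018, 2019, 2016, 2017, 2017],
})

# Count the titles per year; sort_index() puts the years in chronological order.
counts = df["Year"].value_counts().sort_index()

# The Series index (the years) becomes the x tick labels of the bar chart.
ax = counts.plot.bar(rot=0)
ax.set_xlabel("Year")            # label text chosen for illustration
ax.set_ylabel("No. of Movies")   # label text chosen for illustration
ax.yaxis.set_major_locator(MaxNLocator(integer=True))  # integer ticks for counts

# Call plt.show() (or plt.savefig(...)) to display or save the figure.
```

Because the years are the index of `counts`, the bars are labelled 2016 through 2019 without passing `x=` to the plot call.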