python, matplotlib, plot, seaborn, python-ggplot

Plotting over groups of values in Python


I have a dataframe that looks like this.

        country  age  new_user  
298408      UK   32         1   
193010      US   37         0 
164494      UK   17         0   
28149       US   34         0  
297080   China   29         1    

I want to plot the count of new_users for the age groups (20-30, 30-40 and so on) for each country in a single graph in Python.

Basically, I need to plot new_user (value 0) for all the age groups and new_user (value 1) for all the age groups, for all the countries.

I am finding it hard to group the ages into 20-30, 30-40 and so on. Can someone please help me plot this using seaborn, ggplot or matplotlib in Python? ggplot is preferable!

Thank you.


Solution

    import seaborn as sns
    from pandas import DataFrame
    from matplotlib.pyplot import show, legend

    # Small sample frame in the shape of the question's data
    d = {"country": ['UK', 'US', 'US', 'UK', 'PRC'],
         "age": [32, 37, 17, 34, 29],
         "new_user": [1, 0, 0, 0, 1]}

    df = DataFrame(d)
    bins = list(range(0, 100, 10))  # bin edges 0, 10, ..., 90
    # distplot is deprecated since seaborn 0.11; histplot is its replacement for histograms
    ax = sns.histplot(df.age[df.new_user == 1],
                      color='red', bins=bins, label='New')
    sns.histplot(df.age[df.new_user == 0],
                 ax=ax,  # Overplots on first plot
                 color='blue', bins=bins, label='Existing')
    legend()
    show()
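
The histogram above pools all countries together. A minimal sketch of the per-country version the question asks for could bin the ages with `pandas.cut` and then count rows per country, age group and `new_user` value. The helper names (`add_age_group`, `count_users`, `plot_counts`) and the `age_group` column are made up for this illustration, and the bin width of 10 and upper edge of 100 are assumptions rather than anything stated in the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample rows copied from the question.
df = pd.DataFrame({
    "country": ["UK", "US", "UK", "US", "China"],
    "age": [32, 37, 17, 34, 29],
    "new_user": [1, 0, 0, 0, 1],
})


def add_age_group(frame, width=10, upper=100):
    """Return a copy with an 'age_group' column of [low, high) bins."""
    edges = list(range(0, upper + width, width))           # 0, 10, ..., upper
    labels = [f"{lo}-{lo + width}" for lo in edges[:-1]]   # "0-10", "10-20", ...
    out = frame.copy()
    # right=False puts an age of exactly 30 into "30-40", not "20-30".
    out["age_group"] = pd.cut(out["age"], bins=edges, labels=labels, right=False)
    return out


def count_users(frame):
    """Count rows per (country, age_group, new_user), skipping unused bins."""
    return (
        frame.groupby(["country", "age_group", "new_user"], observed=True)
        .size()
        .reset_index(name="count")
    )


def plot_counts(counts):
    """Draw one panel per country: age groups on x, one bar per new_user value."""
    countries = sorted(counts["country"].unique())
    fig, axes = plt.subplots(1, len(countries), sharey=True, squeeze=False)
    for ax, country in zip(axes[0], countries):
        table = (
            counts[counts["country"] == country]
            .pivot(index="age_group", columns="new_user", values="count")
            .fillna(0)
        )
        table.plot(kind="bar", ax=ax, title=country)
    return fig


binned = add_age_group(df)
counts = count_users(binned)

if __name__ == "__main__":
    plot_counts(counts)
    plt.show()
```

If seaborn is preferred, `sns.catplot(data=binned, x="age_group", hue="new_user", col="country", kind="count")` should give a similar faceted count plot directly from the binned frame, without the intermediate `counts` table.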
    
