python, pandas, matplotlib, data-analysis

How to combine two year rows as range in pandas and plot its graph with another column


I have a dataset with ageFrom, ageTo and sales columns. I want to plot a graph with the age range on one axis and sales on the other, to see how sales vary by age range.

I have grouped by ageFrom and ageTo and summed the sales amount. Now I need to plot ageFrom and ageTo as a range on the x-axis and the total amount on the y-axis. How should I approach this?


Solution

  • I'm not sure what you have or haven't tried, but the simplest way is to create a midpoint column for each range (this keeps the x-axis numeric, which a text range label would not be) and then plot it against your sales. The code below should do it:

    import pandas as pd
    
    # sample data
    d = {'ageFrom': [0, 11, 21, 31], 'ageTo': [10, 20, 30, 40], 'sales': [2, 10, 15, 20]}
    # create the dataframe
    df = pd.DataFrame(data=d)
    
    # midpoint of each age range, used as a numeric x value
    df['mean'] = (df['ageFrom'] + df['ageTo']) / 2
    # scatter plot of sales against the range midpoint
    ax1 = df.plot.scatter(x='mean', y='sales')
    

    Alternatively, you can create a text range column and plot it as a bar chart:

    #create a range column
    df['range'] = df['ageFrom'].astype(str)  + '-' + df['ageTo'].astype(str)
    #plot bar
    ax2 = df.plot.bar(x='range', y='sales', rot=0)
    

    Finally, you can also put the non-numeric range labels on the x-axis directly with matplotlib:

    import matplotlib.pyplot as plt
    
    # matplotlib treats the string labels as categories on the x-axis
    plt.scatter(df['range'], df['sales'], c='b', label='range')
    
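
Since the question mentions grouping by ageFrom and ageTo first, here is a minimal end-to-end sketch of that step followed by the bar chart approach. The `raw` dataframe and its values are made up for illustration, and the axis labels are just a suggestion; adapt the column names to your own data.

```python
import pandas as pd

# Hypothetical raw data: several sales rows per age band (values are made up)
raw = pd.DataFrame({
    "ageFrom": [0, 0, 11, 11, 21, 31],
    "ageTo":   [10, 10, 20, 20, 30, 40],
    "sales":   [1, 1, 4, 6, 15, 20],
})

# Sum sales per (ageFrom, ageTo) pair; as_index=False keeps the keys as columns
totals = raw.groupby(["ageFrom", "ageTo"], as_index=False)["sales"].sum()

# Build a text label such as "0-10" for each age band
totals["range"] = totals["ageFrom"].astype(str) + "-" + totals["ageTo"].astype(str)

# One bar per age band, labelled with its range
ax = totals.plot.bar(x="range", y="sales", rot=0, legend=False)
ax.set_xlabel("Age range")
ax.set_ylabel("Total sales")
```

In a script, call `plt.show()` (after `import matplotlib.pyplot as plt`) to display the figure. The groupby result comes back sorted by ageFrom and ageTo, so the bars appear in age order.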