python pandas matplotlib plot stacked

Plotting stacked bar


[Figure: the dataset (left) and a sketch of the desired stacked bar chart (right)]

The left side of the figure shows the dataset. The right side shows a sketch of the stacked bar chart that I want to produce. How can I plot this data to produce such a chart, where the x-axis represents age groups and the y-axis represents length groups?

Any help would be great. Thanks in advance.


Solution

  • Use pd.cut() to bin the values into groups and pandas.DataFrame.groupby() to count the rows in each group

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    
    # data
    np.random.seed(365)  # seed NumPy's generator so the sample data is reproducible
    data = {'age': [np.random.randint(90) for _ in range(100)],
            'length': [np.random.randint(30) for _ in range(100)]}
    
    # dataframe
    df = pd.DataFrame(data)
    
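    # age and days groups
    # right=False makes each bin left-closed, e.g. [50, 60), so an age of 50 falls in '50-59'
    # and a value of 0 falls in the first bin; a length of exactly 20 falls in '≥20 days'
    df['age_group'] = pd.cut(df['age'], bins=[0, 50, 60, 70, 80, 1000], right=False, labels=['<50', '50-59', '60-69', '70-79', '≥80'])
    df['days'] = pd.cut(df['length'], bins=[0, 6, 20, 1000], right=False, labels=['≤5 days', '6-20 days', '≥20 days'])
    
    # df.head() then looks like this (the exact values depend on the random draw):
    #  age  length age_group       days
    #   72      22     70-79   ≥20 days
    #    2      14       <50  6-20 days
    #   14      12       <50  6-20 days
    #   47       4       <50    ≤5 days
    #   18      12       <50  6-20 days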
    
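    # groupby plot
    from matplotlib.ticker import PercentFormatter
    
    # count rows per (age_group, days) pair and pivot the days groups into columns
    counts = df.groupby(['age_group', 'days'], observed=False).size().unstack()
    # convert each age group's counts to percentages of that row's total
    pct = counts.div(counts.sum(axis=1), axis=0) * 100
    # pass figsize to plot.bar(); a separate plt.figure() call would only create an unused empty figure
    ax = pct.plot.bar(stacked=True, figsize=(16, 10))
    ax.legend(loc='center right', bbox_to_anchor=(-0.05, 0.5))
    ax.set_xlabel('Age Groups')
    plt.xticks(rotation=0)
    ax.yaxis.set_major_formatter(PercentFormatter())  # label the y ticks as percentages
    plt.show()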
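
To see how pd.cut() treats values that sit exactly on a bin edge, here is a minimal sketch. The `ages` values are made up for illustration; the bins and labels are the age-group ones used in this answer. By default the bins are closed on the right, which does not match labels such as '<50' and '50-59', while `right=False` closes them on the left.

```python
import pandas as pd

# made-up ages chosen to sit on or next to the bin edges
ages = pd.Series([0, 49, 50, 59, 60, 80])
bins = [0, 50, 60, 70, 80, 1000]
labels = ['<50', '50-59', '60-69', '70-79', '≥80']

# default right=True: intervals are (0, 50], (50, 60], ...
# so 0 is left unbinned (NaN) and 50 is labelled '<50'
right_closed = pd.cut(ages, bins=bins, labels=labels)

# right=False: intervals are [0, 50), [50, 60), ...
# so 0 is labelled '<50' and 50 is labelled '50-59'
left_closed = pd.cut(ages, bins=bins, labels=labels, right=False)

print(pd.DataFrame({'age': ages,
                    'right=True': right_closed,
                    'right=False': left_closed}))
```

Whether the edges should be left-closed or right-closed depends on how the groups are defined in the original dataset, so it is worth checking a few boundary values like this before plotting.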
    


    • This solution is similar to the one provided by Quang Hoang, except that its y-axis runs from 0% to 100%, while the other is normalized from 0 to 1.
    • The other solution could also be used, with the y-axis then formatted as percentages:
      • plt.gca().set_yticklabels(['{:.0f}%'.format(x*100) for x in plt.gca().get_yticks()])
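
As a rough sketch of the 0 to 1 variant: the other answer's code is not shown here, so this is an assumption about what such an approach might look like, not a reproduction of it. It uses pd.crosstab() with normalize='index' to get row-wise fractions, and matplotlib's PercentFormatter with xmax=1 to display them as percentages. The random sample data and the variable names `rng` and `share` are illustrative only.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import PercentFormatter

# illustrative random data, same shape as in the answer above
rng = np.random.default_rng(365)
df = pd.DataFrame({'age': rng.integers(0, 90, size=100),
                   'length': rng.integers(0, 30, size=100)})

# left-closed bins so that boundary values match the labels
df['age_group'] = pd.cut(df['age'], bins=[0, 50, 60, 70, 80, 1000], right=False,
                         labels=['<50', '50-59', '60-69', '70-79', '≥80'])
df['days'] = pd.cut(df['length'], bins=[0, 6, 20, 1000], right=False,
                    labels=['≤5 days', '6-20 days', '≥20 days'])

# normalize='index' divides each row by its total, so every row sums to 1
share = pd.crosstab(df['age_group'], df['days'], normalize='index')

ax = share.plot.bar(stacked=True, rot=0, figsize=(16, 10))
# xmax=1 tells the formatter that 1.0 corresponds to 100%
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1, decimals=0))
ax.set_xlabel('Age Groups')
```

Call plt.show() afterwards to display the figure. A formatter rescales the labels whenever the ticks change, whereas fixed tick labels set with set_yticklabels() can drift out of sync if the axis limits are adjusted later.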