
Plotting stacked bar

[Figure: the dataset (left) and a sketch of the desired stacked bar chart (right)]

The left side of the figure shows the dataset. The right side shows a sketch of the stacked bar chart that I want to produce. How can I plot these data to produce such a chart, where X represents the age groups and Y represents the length groups?

Any help would be great. Thanks in advance.


  • Use pd.cut() and pandas.DataFrame.groupby()

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    # data
    data = {'age': [np.random.randint(90) for _ in range(100)],
            'length': [np.random.randint(30) for _ in range(100)]}
    # dataframe
    df = pd.DataFrame(data)
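    # age and days groups; right=False makes each bin include its left edge,
    # e.g. [50, 60) for '50-59', so the bins match the labels and 0 is not dropped
    df['age_group'] = pd.cut(df['age'], bins=[0, 50, 60, 70, 80, 1000], labels=['<50', '50-59', '60-69', '70-79', '≥80'], right=False)
    # note: with these edges a length of exactly 20 falls in '≥20 days'
    df['days'] = pd.cut(df['length'], bins=[0, 6, 20, 1000], labels=['≤5 days', '6-20 days', '≥20 days'], right=False)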
    # df.head() gives, for example (values vary because the data is random):
    #  age  length age_group       days
    #   72      22     70-79   ≥20 days
    #    2      14       <50  6-20 days
    #   14      12       <50  6-20 days
    #   47       4       <50    ≤5 days
    #   18      12       <50  6-20 days
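    # groupby plot: count rows per (age group, days) pair, then convert each row to percentages
    counts = df.groupby(['age_group', 'days'], observed=False).size().unstack()
    pct = counts.apply(lambda x: x*100/x.sum(), axis=1)
    # DataFrame.plot creates its own figure, so the size is passed here instead of via plt.figure()
    ax = pct.plot(kind='bar', stacked=True, figsize=(16, 10))
    ax.legend(loc='center right', bbox_to_anchor=(-0.05, 0.5))
    ax.set_xlabel('Age Groups')
    # format tick values as percentages (a format string is accepted by matplotlib >= 3.3)
    ax.yaxis.set_major_formatter('{x:.0f}%')
    plt.show()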
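
The labels produced by pd.cut() depend on which edge of each bin is closed. As a minimal sketch, using a few hand-picked ages (illustrative values, not taken from the question's dataset), the default right=True can be compared with right=False:

```python
import pandas as pd

# Illustrative ages sitting on or next to the bin edges
ages = pd.Series([49, 50, 59, 60, 80])
bins = [0, 50, 60, 70, 80, 1000]
labels = ['<50', '50-59', '60-69', '70-79', '≥80']

# Default right=True: bins are (0, 50], (50, 60], ... so 50 is labelled '<50'
right_closed = pd.cut(ages, bins=bins, labels=labels)

# right=False: bins are [0, 50), [50, 60), ... so 50 is labelled '50-59'
left_closed = pd.cut(ages, bins=bins, labels=labels, right=False)

print(pd.DataFrame({'age': ages,
                    'right=True': right_closed,
                    'right=False': left_closed}))
```

With right=False the labels line up with the bin contents, which is probably what the '<50', '50-59', ... naming intends.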


    • This solution is similar to the one provided by Quang Hoang, except that the y-axis here runs from 0% to 100%, while the other is normalized to 0 - 1.
    • The other solution could also be used, with the y-axis then formatted as percentages:
      • plt.gca().set_yticklabels(['{:.0f}%'.format(x*100) for x in plt.gca().get_yticks()])
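
For comparison, here is a hedged sketch of the 0 - 1 normalized variant. It assumes pd.crosstab() with normalize='index' as one way to build such a table (not necessarily what the other answer does) and uses matplotlib's PercentFormatter instead of rewriting the tick labels by hand. The data is randomly generated sample data with the same column names as above.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

# Hypothetical sample data with the same columns as the answer above
rng = np.random.default_rng(0)
df = pd.DataFrame({'age': rng.integers(0, 90, size=100),
                   'length': rng.integers(0, 30, size=100)})
df['age_group'] = pd.cut(df['age'], bins=[0, 50, 60, 70, 80, 1000], right=False,
                         labels=['<50', '50-59', '60-69', '70-79', '≥80'])
df['days'] = pd.cut(df['length'], bins=[0, 6, 20, 1000], right=False,
                    labels=['≤5 days', '6-20 days', '≥20 days'])

# Row-normalized counts: the shares within each age group sum to 1
shares = pd.crosstab(df['age_group'], df['days'], normalize='index')

ax = shares.plot(kind='bar', stacked=True, figsize=(16, 10))
# xmax=1 tells the formatter that 1.0 corresponds to 100%
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1))
ax.set_xlabel('Age Groups')
ax.legend(loc='center right', bbox_to_anchor=(-0.05, 0.5))
plt.show()
```

Using a formatter keeps the labels correct if the ticks change, for example after resizing the figure, whereas fixed tick labels set with set_yticklabels() do not update.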