python pandas matplotlib seaborn stacked-chart

How to add annotations to stacked percentage bar plot


I want to add values to a stacked bar chart using matplotlib. So far I've been able to create the stacked bar chart, but I'm not sure how to add the annotations.

A similar question has been answered here, but for ggplot.

I want output similar to that: not the entire graph, just the annotations in the middle of each segment.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt


data = {'Range':['<10','>10', '>= 20', '<10','>10', '>= 20', '<10','>10', '>= 20'],
    'Price':[50,25,25,70,20,10,80,10,10],
    'Value':[100,50,50,140,40,20,160,20,20]}    

df1 = pd.DataFrame(data)

b1 = df1[df1['Range'] == '<10']['Price']
b2 = df1[df1['Range'] == '>10']['Price']
b3 = df1[df1['Range'] == '>= 20']['Price']

totals = [i+j+k for i,j,k in zip(b1,b2,b3)]
greenBars = [i / j * 100 for i,j in zip(b1, totals)]
orangeBars = [i / j * 100 for i,j in zip(b2, totals)]
blueBars = [i / j * 100 for i,j in zip(b3, totals)]


barWidth = 0.5

names = ('low', 'medium', 'high')
r = [0,1,2]
plt.bar(r, greenBars, color='#b5ffb9', edgecolor='white', width=barWidth, label = '$<10')
plt.bar(r, orangeBars, bottom=greenBars, color='#f9bc86', edgecolor='white', width=barWidth, label = '$>10')
plt.bar(r, blueBars, bottom=[i+j for i,j in zip(greenBars, orangeBars)], color='#a3acff', edgecolor='white', width=barWidth, label = '$>=20')


plt.xticks(r, names)
plt.xlabel("group")

plt.legend(loc='upper left', bbox_to_anchor=(1,1), ncol=1)

plt.show()

The code above creates the stacked plot. Desired output:

For the low category, annotate the stacks with the values from the Value column: 100, 50 and 50.

For the medium category, the values would be 140, 40 and 20.

For the high category, the values would be 160, 20 and 20.


Solution

    • A bar plot can be annotated by extracting the bar position locations from ax.patches.
      • The patch data doesn't contain labels corresponding to the dataframe, so associating different sets of data values becomes a bespoke process.
    • In order to annotate with Value instead of Price, there needs to be a way to associate the corresponding values.
      • A dictionary mapping Price to Value won't work, because there are repeated values.
      • Make a pivoted dataframe for Value and a corresponding dataframe for Price. This will ensure corresponding data is in the same location.
    • col_idx and row_idx will be used with .iloc to find the correct value in df_value, with which to annotate the plot.
      • Both col_idx and row_idx are updated inside if i%3 == 0, because there are 3 bars and 3 segments. With a different number of bars or segments, the reset condition must change accordingly.
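The index bookkeeping in the last bullet can be sketched without any plotting. This is only an illustration: `patch_to_cell` and `n_bars` are hypothetical names, and it assumes, as the loop in this answer does, that `ax.patches` lists every bar of the first stacked series, then every bar of the second, and so on.

```python
def patch_to_cell(i, n_bars):
    """Map a 0-based patch index to (row_idx, col_idx) for .iloc.

    Assumption: patches are ordered series by series, so the column
    index advances once every n_bars patches.
    """
    col_idx, row_idx = divmod(i, n_bars)
    return row_idx, col_idx


# 3 bars x 3 segments, matching the example data
cells = [patch_to_cell(i, n_bars=3) for i in range(9)]

# 2 bars x 4 segments, a case where a fixed i%3 check would not fit
cells_2x4 = [patch_to_cell(i, n_bars=2) for i in range(8)]
```

Under that ordering assumption only the number of bars matters for the reset, so passing `n_bars=len(df_value)` would cover frames of other shapes.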
    import pandas as pd
    import matplotlib.pyplot as plt
    
    # create the dataframe
    data = {'Range':['<10','>10', '>= 20', '<10','>10', '>= 20', '<10','>10', '>= 20'],
            'Price':[50,25,25,70,20,10,80,10,10],
            'Value':[100,50,50,140,40,20,160,20,20]}    
    
    df1 = pd.DataFrame(data)
    
    # pivot the price data
    df_price = df1.assign(idx=df1.groupby('Range').cumcount()).pivot(index='idx', columns='Range', values='Price')
    
    # df_price looks like this:
    # Range  <10  >10  >= 20
    # idx
    # 0       50   25     25
    # 1       70   20     10
    # 2       80   10     10
    
    # pivot the value data
    df_value = df1.assign(idx=df1.groupby('Range').cumcount()).pivot(index='idx', columns='Range', values='Value')
    
    # df_value looks like this:
    # Range  <10  >10  >= 20
    # idx
    # 0      100   50     50
    # 1      140   40     20
    # 2      160   20     20
    
    # set colors
    colors = ['#b5ffb9', '#f9bc86', '#a3acff']
    
    # plot the price
    ax = df_price.plot.bar(stacked=True, figsize=(8, 6), color=colors, ec='w')
    
    # label the x-axis
    plt.xticks(ticks=range(3), labels=['low', 'med', 'high'], rotation=0)
    
    # x-axis title
    plt.xlabel('group')
    
    # position the legend
    plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
    
    # annotate the bar segments
    # col and row iloc indices for df_value
    col_idx = 0
    row_idx = 0
    
    # iterate through each bar patch from ax
    for i, p in enumerate(ax.patches, 1):
    
        left, bottom, width, height = p.get_bbox().bounds
        v = df_value.iloc[row_idx, col_idx]
        if width > 0:
            ax.annotate(f'{v:0.0f}', xy=(left+width/2, bottom+height/2), ha='center', va='center')
    
            # alternatively, use this line to add commas for thousands
            # ax.annotate(f'{v:,}', xy=(left+width/2, bottom+height/2), ha='center', va='center')
        
        row_idx += 1
        if i%3 == 0:  # there are three bars, so update the indices 
            col_idx += 1
            row_idx = 0
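As a possible alternative to tracking indices by hand, newer matplotlib releases (3.4 and later) provide `Axes.bar_label`, which accepts custom labels per bar container. The sketch below is not part of the original answer. It assumes `ax.containers` holds one container per column of `df_price`, in column order, and it selects the `Agg` backend only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this for interactive use
import matplotlib.pyplot as plt
import pandas as pd

data = {'Range': ['<10', '>10', '>= 20'] * 3,
        'Price': [50, 25, 25, 70, 20, 10, 80, 10, 10],
        'Value': [100, 50, 50, 140, 40, 20, 160, 20, 20]}
df1 = pd.DataFrame(data)

# pivot Price and Value the same way so their cells line up
df_idx = df1.assign(idx=df1.groupby('Range').cumcount())
df_price = df_idx.pivot(index='idx', columns='Range', values='Price')
df_value = df_idx.pivot(index='idx', columns='Range', values='Value')

ax = df_price.plot.bar(stacked=True, figsize=(8, 6),
                       color=['#b5ffb9', '#f9bc86', '#a3acff'], ec='w')

annotations = []
# pair each bar container with the df_value column of the same name
for container, col in zip(ax.containers, df_price.columns):
    labels = df_value[col].astype(str).tolist()
    texts = ax.bar_label(container, labels=labels, label_type='center')
    annotations.extend(t.get_text() for t in texts)
```

Because the labels are taken column by column from `df_value`, no reset condition is needed when the number of bars or segments changes.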
    
