Tags: python, pandas, matplotlib, stacked-bar-chart, plot-annotations

Create a stacked bar plot of percentages and annotate with count


My raw data is in df (defined in the code below). I calculate the percentages from it (stored in rel) and plot them as a stacked bar graph.

Now I want to add the values (the counts, not the percentages) from my first dataframe to the center of each bar segment.

My code so far:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from csv import reader
import seaborn as sns

df = pd.DataFrame({'IL':['Balıkesir', 'Bursa', 'Çanakkale', 'Edirne', 'İstanbul', 'Kırklareli', 'Kocaeli', 'Sakarya','Tekirdağ','Yalova'],'ENGELLIUYGUN':[7,13,3,1,142,1,14,1,2,2],'ENGELLIUYGUNDEGIL':[1,5,0,0,55,0,3,0,1,0]})
iller=df.iloc[:,[0]]

df_total = df["ENGELLIUYGUN"] + df["ENGELLIUYGUNDEGIL"]
df_rel = df[df.columns[1:]].div(df_total, 0)*100

rel=[]
rel=pd.DataFrame(df_rel)
rel['İller'] = iller

d=df.iloc[:,[1]] #I want to add these values to the center of blue bars.
f=df.iloc[:,[2]] #I want to add these values to the center of green bars.

sns.set_theme (style='whitegrid')

ax=rel.plot(x='İller',kind='bar', stacked=True, color=["#3a88e2","#5c9e1e"], label=("Uygun","Uygun Değil"))

plt.legend(["Evet","Hayır"],fontsize=8, bbox_to_anchor=(1, 0.5))

plt.xlabel('...........',fontsize=12)
plt.ylabel('..........',fontsize=12)
plt.title('.............',loc='center',fontsize=14)
plt.ylim(0,100)
ax.yaxis.grid(color='gray', linestyle='dashed')

plt.show()

This is what I have so far:

[Image: my current plot]

I want exactly the same style as in this image:

[Image: example]

I am using Anaconda-Jupyter Notebook.


Solution

    • Answering: "I want to add values (non percentage values) to the centers of each bar but from my first dataframe."
    • The correct way to annotate bars is with .bar_label (available from matplotlib 3.4), as explained in this answer.
      • The values from df can be passed to the labels= parameter instead of the percentages.
    • This answer shows how to succinctly calculate the percentages, but it plots the counts and annotates with both percentage and value, whereas this OP wants to plot the percentages on the y-axis and annotate with the counts.
    • This answer shows how to place the legend at the bottom of the plot.
    • This answer shows how to format the axis tick labels as percent.
    • See pandas.DataFrame.plot for an explanation of the available parameters.
    • Regarding "I am using Anaconda-Jupyter Notebook": everything from the comment # plot percent; ... onward should be in the same notebook cell.
    • Tested in python 3.11, pandas 1.5.2, matplotlib 3.6.2.
    import pandas as pd
    import matplotlib.ticker as tkr
    
    # sample data
    df = pd.DataFrame({'IL': ['Balıkesir', 'Bursa', 'Çanakkale', 'Edirne', 'İstanbul', 'Kırklareli', 'Kocaeli', 'Sakarya','Tekirdağ','Yalova'],
                       'ENGELLIUYGUN': [7, 13, 3, 1, 142, 1, 14, 1, 2, 2],
                       'ENGELLIUYGUNDEGIL': [1, 5, 0, 0, 55, 0, 3, 0, 1, 0]})
    
    # set IL as the index
    df = df.set_index('IL')
    
    # calculate the percent
    per = df.div(df.sum(axis=1), axis=0).mul(100)
    
    # plot percent; adjust rot= for the rotation of the xtick labels
    ax = per.plot(kind='bar', stacked=True, figsize=(10, 8), rot=0,
                  color=['#3a88e2', '#5c9e1e'], yticks=range(0, 101, 10),
                  title='my title', ylabel='', xlabel='')
    
    # move the legend
    ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05), ncol=2, frameon=False)
    
    # format the y-axis tick labels
    ax.yaxis.set_major_formatter(tkr.PercentFormatter())
    
    # iterate through the containers
    for c in ax.containers:
        
        # get the current segment label (a string); corresponds to column / legend
        col = c.get_label()
        
        # use label to get the appropriate count values from df
        # customize the label to account for cases when there might not be a bar section
        labels = [v if v > 0 else '' for v in df[col]]
        # the following will also work
        # labels = df[col].replace(0, '')
        
        # add the annotation
        ax.bar_label(c, labels=labels, label_type='center', fontweight='bold')
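
The loop above depends on two things: the percentages are computed per row, and each container's label matches a column name in df. The following is a minimal sketch for checking both on a three-row subset of the sample data. The Agg backend and the names container_labels and count_labels are assumptions added for illustration so the check can run outside a notebook; they are not part of the answer's code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import pandas as pd

# three rows of the sample data are enough for the check
df = pd.DataFrame({'IL': ['Balıkesir', 'Bursa', 'Çanakkale'],
                   'ENGELLIUYGUN': [7, 13, 3],
                   'ENGELLIUYGUNDEGIL': [1, 5, 0]}).set_index('IL')

# row-wise percentages: each count divided by its row total, times 100
per = df.div(df.sum(axis=1), axis=0).mul(100)

ax = per.plot(kind='bar', stacked=True)

# one container per plotted column; its label should be the column name
container_labels = [c.get_label() for c in ax.containers]

# counts used for annotation, with an empty string where the count is zero
count_labels = {col: [int(v) if v > 0 else '' for v in df[col]]
                for col in container_labels}

print(per.loc['Balıkesir'].tolist())
print(container_labels)
print(count_labels)
```

For Balıkesir the counts are 7 and 1, so the plotted heights are 7/8 and 1/8 of 100, while the annotation text still comes from the raw counts.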
    


    Alternate Annotation Implementation

    • Since the column names in df and per are the same, and iterating over a DataFrame yields its column names, the names can be taken directly from per.
    # iterate through the containers and per column names
    for c, col in zip(ax.containers, per):
        
        # add the annotations with custom labels from df
        ax.bar_label(c, labels=df[col].replace(0, ''), label_type='center', fontweight='bold')
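
The zip-based version relies on iterating over per yielding its column names in column order, and on replace(0, '') blanking the zero counts. Here is a small pandas-only sketch of those two behaviors, again on a three-row subset of the sample data. The names columns_from_per and blanked are hypothetical and only used for this illustration; pairing containers with columns by position also assumes the containers are created in column order, as the code above does.

```python
import pandas as pd

df = pd.DataFrame({'IL': ['Balıkesir', 'Bursa', 'Çanakkale'],
                   'ENGELLIUYGUN': [7, 13, 3],
                   'ENGELLIUYGUNDEGIL': [1, 5, 0]}).set_index('IL')
per = df.div(df.sum(axis=1), axis=0).mul(100)

# iterating over a DataFrame yields its column names, in column order
columns_from_per = list(per)

# replace(0, '') keeps the non-zero counts and turns zeros into empty strings
blanked = df['ENGELLIUYGUNDEGIL'].replace(0, '').tolist()

print(columns_from_per)
print(blanked)
```

Because the zero count becomes an empty string, bar_label has no text to draw for a segment that has no height.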