python-3.x pandas-groupby seaborn

Seaborn barplot display numeric values from groupby


Data from: https://www.kaggle.com/datasets/prasertk/homicide-suicide-rate-and-gdp

I have a working barplot.

Code:

df_mean_country = df.groupby(["country", "iso3c", "incomeLevel"])["Intentional homicides (per 100,000 people)"].mean().reset_index()
top_ten_hom = df_mean_country.sort_values("Intentional homicides (per 100,000 people)", ascending=False).head(10)
print(top_ten_hom, '\n')


plt.figure(figsize=(16, 8), dpi=200)
plt.xticks(rotation=45, fontsize=14)
plt.ylabel("Suicide mortality rate", fontsize=16, weight="bold")
plt.title("Top 10 countries with Homicides per 100,000 people", fontname="Impact", fontsize=25)
xy = sns.barplot(data=top_ten_hom,
                 y="Intentional homicides (per 100,000 people)",
                 x="country",
                 hue="incomeLevel",
                 dodge=False)
for item in xy.get_xticklabels():
    item.set_rotation(45)
    xy.bar_label(xy.containers[0])
plt.legend(fontsize=14, title="Income Level")
plt.tight_layout()
plt.show()

Output: (bar plot image not included)

The issue is that it is only displaying the values for the 'Lower Middle Income' bars.

I assume that this is somehow a function of the groupby used to create the df, but I have never had this happen before.

The values are all present:

                   country iso3c          incomeLevel  Intentional homicides (per 100,000 people)
68             El Salvador   SLV  Lower middle income                                      74.178
47                Colombia   COL  Upper middle income                                      50.996
102               Honduras   HND  Lower middle income                                      47.886
218           South Africa   ZAF  Upper middle income                                      42.121
119                Jamaica   JAM  Upper middle income                                      40.821
137                Lesotho   LSO  Lower middle income                                      36.921
256          Venezuela, RB   VEN       Not classified                                      36.432
258  Virgin Islands (U.S.)   VIR          High income                                      35.765
177                Nigeria   NGA  Lower middle income                                      34.524
95               Guatemala   GTM  Upper middle income                                      33.251 

I want the values displayed on all of the bars, not just the 'Lower Middle Income' bars.


Solution

  • Each hue value leads to one entry in ax.containers, so xy.containers[0] only holds the bars of the first hue level. You can loop through all containers to add the labels.

    Some additional remarks:

    • Matplotlib has both an "old" pyplot interface (plt.ylabel(), plt.title(), ...) and a "new" object-oriented interface (ax.set_ylabel(), ax.set_title(), ...), which has itself been around for more than 10 years. Not mixing them helps readability and maintainability. Some functionality is easier to reach through the object-oriented interface (e.g. ax.tick_params()).
    • Set labels, titles, and tick parameters after creating the seaborn plot, because seaborn sets its own labels and parameters.
    • To map tutorials and code examples onto your own code more easily, name the return value of sns.barplot something like ax.

    Putting it together:
    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    
    df = pd.read_csv('suicide homicide gdp.csv')
    df_mean_country = df.groupby(["country", "iso3c", "incomeLevel"])[
        "Intentional homicides (per 100,000 people)"].mean().reset_index()
    top_ten_hom = df_mean_country.sort_values("Intentional homicides (per 100,000 people)", ascending=False).head(10)
    
    plt.figure(figsize=(16, 8), dpi=200)
    ax = sns.barplot(data=top_ten_hom,
                     y="Intentional homicides (per 100,000 people)",
                     x="country",
                     hue="incomeLevel",
                     dodge=False)
    ax.set_ylabel("Intentional homicides (per 100,000 people)", fontsize=16, weight="bold")
    ax.set_xlabel("")
    ax.set_title("Top 10 countries with Homicides per 100,000 people", fontname="Impact", fontsize=25)
    
    ax.tick_params(axis='x', rotation=45, size=0, labelsize=14)
    
    # one container per hue level: label the bars of each of them
    for bars in ax.containers:
        ax.bar_label(bars, fontsize=12, fmt='%.2f')
    ax.legend(fontsize=14, title="Income Level", title_fontsize=18)
    plt.tight_layout()
    plt.show()
    

    Resulting plot (image not included): calling bar_label for sns.barplot with hue
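
    To try this without downloading the Kaggle CSV, here is a minimal sketch that types in the ten rows printed in the question by hand (the iso3c column is left out because the plot does not use it). The names col and labels, the Agg backend, and the use of plt.subplots() with ax=ax are choices made for this sketch, not part of the original code. It loops over ax.containers as described above and collects the label texts, so you can check that all ten bars get a value.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

col = "Intentional homicides (per 100,000 people)"

# The ten rows from the question's printout, entered by hand (assumption: no CSV needed).
top_ten_hom = pd.DataFrame({
    "country": ["El Salvador", "Colombia", "Honduras", "South Africa", "Jamaica",
                "Lesotho", "Venezuela, RB", "Virgin Islands (U.S.)", "Nigeria",
                "Guatemala"],
    "incomeLevel": ["Lower middle income", "Upper middle income", "Lower middle income",
                    "Upper middle income", "Upper middle income", "Lower middle income",
                    "Not classified", "High income", "Lower middle income",
                    "Upper middle income"],
    col: [74.178, 50.996, 47.886, 42.121, 40.821, 36.921, 36.432, 35.765, 34.524, 33.251],
})

# Object-oriented interface: create the Axes first and hand it to seaborn.
fig, ax = plt.subplots(figsize=(16, 8))
sns.barplot(data=top_ten_hom, x="country", y=col, hue="incomeLevel",
            dodge=False, ax=ax)

# bar_label returns the created annotations; keep the non-empty texts.
# (Depending on the seaborn version, a container may also hold empty placeholder
# bars for countries that belong to another hue level.)
labels = []
for bars in ax.containers:
    labels.extend(t.get_text() for t in ax.bar_label(bars, fmt="%.2f") if t.get_text())

print(len(ax.containers), "containers,", len(labels), "labels")
```

    Replacing the loop with ax.bar_label(ax.containers[0]) labels only the bars of the first hue level, which is the behaviour described in the question. In an interactive session, drop the matplotlib.use("Agg") line and call plt.show() at the end to see the figure.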