I'm trying to build a population pyramid from a pandas DataFrame using seaborn. The problem is that some of the data isn't displayed: as you can see from the plot I created, the bars for the oldest age classes are missing. The y-axis has 21 ticks and the DataFrame has 21 age classes, so why don't the bars match the ticks? What am I missing?
Here's the code I wrote:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
df = pd.DataFrame({'Age': ['0-4','5-9','10-14','15-19','20-24','25-29','30-34','35-39','40-44','45-49','50-54','55-59','60-64','65-69','70-74','75-79','80-84','85-89','90-94','95-99','100+'],
'Male': [-49228000, -61283000, -64391000, -52437000, -42955000, -44667000, -31570000, -23887000, -22390000, -20971000, -17685000, -15450000, -13932000, -11020000, -7611000, -4653000, -1952000, -625000, -116000, -14000, -1000],
'Female': [52367000, 64959000, 67161000, 55388000, 45448000, 47129000, 33436000, 26710000, 25627000, 23612000, 20075000, 16368000, 14220000, 10125000, 5984000, 3131000, 1151000, 312000, 49000, 4000, 0]})
AgeClass = ['100+','95-99','90-94','85-89','80-84','75-79','70-74','65-69','60-64','55-59','50-54','45-49','40-44','35-39','30-34','25-29','20-24','15-19','10-14','5-9','0-4']
bar_plot = sns.barplot(x='Male', y='Age', data=df, order=AgeClass)
bar_plot = sns.barplot(x='Female', y='Age', data=df, order=AgeClass)
bar_plot.set(xlabel="Population (tens of millions)", ylabel="Age-Group", title="Population Pyramid")
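As explained by JohanC, the data is not missing; it's just very small compared to the other bars. For example, the 100+ class has 1,000 males and 0 females, while the largest bar (females aged 10-14) is about 67 million.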
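To get a feel for how small those bars are, here is a rough sketch that estimates each bar's on-screen length. It uses the numbers from the question; the name `HALF_AXIS_PX` and its value of 300 pixels per half of the x-axis are assumptions for illustration, not something measured from the actual figure.

```python
# Data copied from the question's DataFrame (youngest to oldest age class).
ages = ['0-4', '5-9', '10-14', '15-19', '20-24', '25-29', '30-34', '35-39',
        '40-44', '45-49', '50-54', '55-59', '60-64', '65-69', '70-74',
        '75-79', '80-84', '85-89', '90-94', '95-99', '100+']
male = [-49228000, -61283000, -64391000, -52437000, -42955000, -44667000,
        -31570000, -23887000, -22390000, -20971000, -17685000, -15450000,
        -13932000, -11020000, -7611000, -4653000, -1952000, -625000,
        -116000, -14000, -1000]
female = [52367000, 64959000, 67161000, 55388000, 45448000, 47129000,
          33436000, 26710000, 25627000, 23612000, 20075000, 16368000,
          14220000, 10125000, 5984000, 3131000, 1151000, 312000,
          49000, 4000, 0]

# Assumption: each half of the x-axis spans roughly 300 pixels on screen.
HALF_AXIS_PX = 300

# The longest bar on either side sets the scale of the axis.
largest = max(max(abs(m) for m in male), max(female))

def bar_px(value):
    """Approximate bar length in pixels, relative to the longest bar."""
    return abs(value) / largest * HALF_AXIS_PX

for age, m, f in zip(ages, male, female):
    print(f"{age:>6}  male {bar_px(m):8.3f} px  female {bar_px(f):8.3f} px")

# Age classes where both bars are shorter than one pixel.
invisible = [age for age, m, f in zip(ages, male, female)
             if bar_px(m) < 1 and bar_px(f) < 1]
print(invisible)
```

Under that assumption, the three oldest age classes come out shorter than a single pixel on both sides, so they are drawn but effectively invisible.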
Another factor is that your bars seem to have a white border, which hides the very small bars at the top. Try passing lw=0 in your call to barplot. This is what I am getting:
bar_plot = sns.barplot(x='Male', y='Age', data=df, order=AgeClass, lw=0)
bar_plot = sns.barplot(x='Female', y='Age', data=df, order=AgeClass, lw=0)