python pandas matplotlib seaborn

Plot frequency or percent for string var


EDIT: My question is not a duplicate of "Given a pandas Series that represents frequencies of a value, how can I turn those frequencies into percentages?", because I am asking for a plot, not a frequency table. This question is misclassified.

I am trying to replicate a bar graph showing the frequency or percent of each value of a string variable in Python.

I can get this graph in Stata, but my attempt in Python fails. The Stata code (my Python code is shown further below):

clear all
input str10 Genero
"Mujer"
"Mujer"
"Hombre"
"Hombre"
"Hombre"
end

graph bar (percent), over(Genero)

[Image: Stata output]

Python code with the same data, which fails to produce the plot:

import numpy as np
import os
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

os.chdir("C:/Users/Desktop")
import matplotlib.ticker as mtick

df = pd.DataFrame({'Genero': ['Mujer','Mujer','Hombre','Hombre','Hombre']})

print(df)


axx = df.plot.bar(x='Genero')
axx.yaxis.set_major_formatter(mtick.PercentFormatter())
plt.savefig('myfilename.png')

Solution

  • This seems fairly straightforward.

    # Uses df and plt from the question's code
    counts = df['Genero'].value_counts()  # frequency of each category
    x = counts.index.tolist()  # x-axis groups
    y = (counts / counts.sum() * 100).tolist()  # y-axis values as percentages
    
    fig, axs = plt.subplots(1, figsize=(20,10))
    axs.set_title("TITLE")
    axs.set_xlabel('XLABEL')
    axs.set_ylabel('YLABEL')
    
    axs.bar(x,y)
    
    plt.show()
    

    The code could be a little cleaner, and perhaps someone has a better way of doing it, but it should be fine for your purposes.
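
A more compact variant, offered as a sketch rather than the definitive way to do it: `value_counts(normalize=True)` returns each category's share directly, and the `PercentFormatter` from the question then labels the y-axis in percent. The `matplotlib.use("Agg")` line is an assumption added only so the script runs without a display, and the axis labels are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import pandas as pd

df = pd.DataFrame({'Genero': ['Mujer', 'Mujer', 'Hombre', 'Hombre', 'Hombre']})

# Share of each category, scaled from 0-1 to 0-100
percent = df['Genero'].value_counts(normalize=True).mul(100)

# One bar per category; rot=0 keeps the category labels horizontal
ax = percent.plot.bar(rot=0)
ax.set_xlabel('Genero')    # illustrative label
ax.set_ylabel('percent')   # illustrative label
ax.yaxis.set_major_formatter(mtick.PercentFormatter())
```

Calling `plt.savefig('myfilename.png')` afterwards, as in the question, writes the figure to disk.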