EDIT: My question is not a duplicate of "Given a pandas Series that represents frequencies of a value, how can I turn those frequencies into percentages?", because I am asking for a plot, not a frequency table. This question has been misclassified.
I am trying to replicate in Python a bar graph that shows the frequency or percentage of each value of a string variable.
I can get this in Stata, but I have not managed it in Python. The Stata code is below, followed by my Python attempt:
clear all
input str10 Genero
"Mujer"
"Mujer"
"Hombre"
"Hombre"
"Hombre"
end
graph bar (percent), over(Genero)
Python code with the same data, which fails to produce the plot:
import numpy as np
import os
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
os.chdir("C:/Users/Desktop")
import matplotlib.ticker as mtick
df = pd.DataFrame({'Genero': ['Mujer','Mujer','Hombre','Hombre','Hombre']})
print(df)
axx = df.plot.bar(x='Genero')
axx.yaxis.set_major_formatter(mtick.PercentFormatter())
plt.savefig('myfilename.png')
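This is fairly straightforward: count how often each category occurs, convert the counts to percentages, and draw the bars with matplotlib.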
counts = df['Genero'].value_counts()  # occurrences of each category
x = counts.index.tolist()  # x-axis groups
y = (counts / counts.sum() * 100).tolist()  # y-axis values as percentages
fig, axs = plt.subplots(1, figsize=(20,10))
axs.set_title("TITLE")
axs.set_xlabel('XLABEL')
axs.set_ylabel('YLABEL')
axs.bar(x,y)
plt.show()
The code could be a little cleaner, and perhaps someone has a better way of doing it, but it should be fine for your purposes.
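For what it's worth, here is one possible shorter variant, offered as a sketch rather than the definitive way to do it. It assumes pandas' `value_counts(normalize=True)` to get proportions directly and reuses the `PercentFormatter` from the question to label the y-axis. The helper name `plot_percent_bar` and the "percent" axis label are made up for illustration.

```python
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import pandas as pd


def plot_percent_bar(series):
    """Draw a bar chart of the percentage share of each distinct value.

    Hypothetical helper for illustration; returns the Axes and the
    percentages so they can be inspected or styled further.
    """
    # normalize=True yields proportions that sum to 1; scale them to 0-100
    pct = series.value_counts(normalize=True).mul(100)
    ax = pct.plot.bar(rot=0)  # rot=0 keeps the category labels horizontal
    ax.set_ylabel("percent")
    # PercentFormatter uses xmax=100 by default, matching the 0-100 scale
    ax.yaxis.set_major_formatter(mtick.PercentFormatter())
    return ax, pct


df = pd.DataFrame({"Genero": ["Mujer", "Mujer", "Hombre", "Hombre", "Hombre"]})
ax, pct = plot_percent_bar(df["Genero"])
```

After that, `plt.show()` or `ax.figure.savefig('myfilename.png')` should display or save the figure. Note that `value_counts()` orders the bars by frequency, not alphabetically, so the bar order may differ from Stata's on other data.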