I'm a beginner Python coder and I want to build a Python function that calculates a specific indicator.
For example, the data looks like this:
ID  status      Age  Gender
01  healthy     16   Male
02  un_healthy  14   Female
03  un_healthy  22   Male
04  healthy     12   Female
05  healthy     33   Female
I want to build a function that calculates the percentage of healthy people, i.e. healthy / (healthy + un_healthy). My attempt:
def health_rate(healthy, un_healthy,age){
    if (age >= 15):
        if (gender == "Male"):
            return rateMale= (count(healthy)/count(healthy)+count(un_healthy))
        Else
            return rateFemale= (count(healthy)/count(healthy)+count(un_healthy))
    Else
        return print("underage");
and then just use .apply, but the logic isn't right and I still don't get my desired output. I want the function to return the Male rate and the Female rate.
You could use pivot_table (df is your dataframe):
# Keep only rows with Age >= 15, then count IDs per status and gender
df = df[df.Age >= 15].pivot_table(
    index="status", columns="Gender", values="ID",
    aggfunc="count", margins=True, fill_value=0
)
Result for your sample dataframe:
Gender      Female  Male  All
status
healthy          1     1    2
un_healthy       0     1    1
All              1     2    3
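If you want to reproduce the counts yourself, here is a minimal sketch that builds the sample data from the question by hand. The names `sample` and `counts` are my own, and storing ID as strings is an assumption I made to keep the leading zeros.

```python
import pandas as pd

# Sample data copied from the question; IDs are kept as strings
# (an assumption) so the leading zeros survive.
sample = pd.DataFrame({
    "ID": ["01", "02", "03", "04", "05"],
    "status": ["healthy", "un_healthy", "un_healthy", "healthy", "healthy"],
    "Age": [16, 14, 22, 12, 33],
    "Gender": ["Male", "Female", "Male", "Female", "Female"],
})

# Same filter and pivot as in the answer, stored under a separate name
# so that `sample` itself is left untouched.
counts = sample[sample.Age >= 15].pivot_table(
    index="status", columns="Gender", values="ID",
    aggfunc="count", margins=True, fill_value=0,
)
print(counts)
```

Only IDs 01, 03 and 05 pass the age filter, which is why the grand total is 3.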
If you want percentages:
df = (df / df.loc["All", :] * 100).drop("All")
Result:
Gender      Female  Male        All
status
healthy      100.0  50.0  66.666667
un_healthy     0.0  50.0  33.333333
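Since you asked for a function that returns the Male rate and the Female rate, here is one possible sketch. It uses groupby instead of pivot_table, which is a different technique from the one above. The signature `health_rate(df, min_age=15)` and the `sample` dataframe are my own assumptions, not something from your code.

```python
import pandas as pd

def health_rate(df, min_age=15):
    """Return the percentage of healthy people per gender for Age >= min_age."""
    adults = df[df["Age"] >= min_age]
    # True where the person is healthy; the mean of a boolean
    # column is the share of True values.
    is_healthy = adults["status"].eq("healthy")
    return is_healthy.groupby(adults["Gender"]).mean() * 100

# Sample data copied from the question (IDs as strings is an assumption).
sample = pd.DataFrame({
    "ID": ["01", "02", "03", "04", "05"],
    "status": ["healthy", "un_healthy", "un_healthy", "healthy", "healthy"],
    "Age": [16, 14, 22, 12, 33],
    "Gender": ["Male", "Female", "Male", "Female", "Female"],
})

rates = health_rate(sample)
print(rates["Male"], rates["Female"])  # 50.0 100.0
```

The function works on the whole dataframe at once, so there is no need for .apply: a rate is a property of a group of rows, not of a single row, which is why the row-by-row approach in the question could not produce it.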