Tags: python, dataframe, indicator

How to structure a Python function that takes input from a DataFrame to calculate a specific indicator


I'm a beginner Python coder, and I want to build a Python function that calculates a specific indicator.

As an example, the data looks like this:

ID    status        Age    Gender
01    healthy       16     Male
02    un_healthy    14     Female
03    un_healthy    22     Male
04    healthy       12     Female
05    healthy       33     Female

I want to build a function that calculates the percentage of healthy people, i.e. healthy / (healthy + un_healthy):

def health_rate(healthy, un_healthy, age, gender):
    # count() is a placeholder for "number of rows with this status"
    if age >= 15:
        if gender == "Male":
            rate_male = count(healthy) / (count(healthy) + count(un_healthy))
            return rate_male
        else:
            rate_female = count(healthy) / (count(healthy) + count(un_healthy))
            return rate_female
    else:
        return "underage"

and then just use .apply,

but the logic isn't right. I still don't get my desired output: I want to return the Male rate and the Female rate.


Solution

  • You could use pivot_table (where df is your DataFrame):

    df = df[df.Age >= 15].pivot_table(
        index="status", columns="Gender", values="ID",
        aggfunc="count", margins=True, fill_value=0
    )
    

    Result for your sample dataframe:

    Gender      Female  Male  All
    status                       
    healthy          1     1    2
    un_healthy       0     1    1
    All              1     2    3
    

    If you want percentages:

    df = (df / df.loc["All", :] * 100).drop("All")
    

    Result:

    Gender      Female  Male        All
    status                             
    healthy      100.0  50.0  66.666667
    un_healthy     0.0  50.0  33.333333
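
  • If you would rather have a function that returns only the Male and Female rates, as the question asks, the same idea can be sketched with a filter plus groupby. This is a minimal sketch, not the only way to do it: the DataFrame is rebuilt from the sample data in the question, and the signature health_rate(data, min_age=15) is a hypothetical one chosen for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["01", "02", "03", "04", "05"],
    "status": ["healthy", "un_healthy", "un_healthy", "healthy", "healthy"],
    "Age": [16, 14, 22, 12, 33],
    "Gender": ["Male", "Female", "Male", "Female", "Female"],
})


def health_rate(data, min_age=15):
    """Percentage of healthy people per gender (hypothetical helper)."""
    # Keep only the rows at or above the age threshold.
    eligible = data[data["Age"] >= min_age]
    # The mean of a boolean column is the share of True values,
    # i.e. healthy / (healthy + un_healthy) within each gender.
    is_healthy = eligible["status"].eq("healthy")
    return is_healthy.groupby(eligible["Gender"]).mean() * 100


rates = health_rate(df)
print(rates)
```

    The function works on the whole DataFrame at once, so it is called directly rather than through .apply, which would pass it one row at a time. Individual values can then be read with rates["Male"] and rates["Female"].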