Search code examples
pandasdataframemulti-index

How to make calculations on pnadas dataframe with multiple index?


Suppose I have a multi indexed DataFrame:

                      Frequency
occupation    gender           
administrator F              36
              M              43
artist        F              13
              M              15
doctor        M               7
educator      F              26
              M              69
engineer      F               2
              M              65

where the first two columns are indexes. How do I add another column that gives the ratio between F and M?


Solution

  • Use Series.unstack for reshape, so for ratio divide columns:

    df1 = df['Frequency'].unstack()
    df1['ratio'] = df1['F'].div(df1['M'])
    print (df1)
                      F     M     ratio
    administrator  36.0  43.0  0.837209
    artist         13.0  15.0  0.866667
    doctor          NaN   7.0       NaN
    educator       26.0  69.0  0.376812
    engineer        2.0  65.0  0.030769
    entertainment   2.0  16.0  0.125000
    

    If need new column:

    s = df['Frequency'].xs('F', level=1).div(df['Frequency'].xs('M', level=1))
    
    
    df['ratio'] = df.index.droplevel(1).map(s)
    print (df)
                     Frequency     ratio
    administrator F         36  0.837209
                  M         43  0.837209
    artist        F         13  0.866667
                  M         15  0.866667
    doctor        M          7       NaN
    educator      F         26  0.376812
                  M         69  0.376812
    engineer      F          2  0.030769
                  M         65  0.030769
    entertainment F          2  0.125000
                  M         16  0.125000