Search code examples
pandasdataframemulti-index

pandas multi-index divide aggregated counts


My raw data is as such

    level0  level1  level2
0   0       A       foo
1   0       A       bar
2   0       B       foo
3   0       B       foo
4   0       B       foo
5   0       B       bar
6   1       A       foo
7   1       A       bar

And then I try to group the counts as such df.groupby(['level0', 'level1', 'level2']).size()

to get this

level0  level1  level2
0       A       foo            1
                bar            1
        B       foo            3
                bar            1
1       A       foo            1
                bar            1
        B       foo            1
                bar            1
2       A       foo            1
                bar            1
        B       foo            1
                bar            1

And now I would simply like to divide the counts of level2 in a ratio of bar/foo to get something like this:

level0  level1      {bar counts/foo counts}
0       A           1.00
        B           0.33
1       A           1.00
        B           1.00
2       A           1.00
        B           1.00

Perhaps I even setup the original groupby wrong but also tried .div, .apply, etc but having a mental block. Thank you for any guidance!


Solution

  • unstack and eval:

    tmp = df.groupby(['level0', 'level1', 'level2']).size()
    
    out = (tmp.unstack().eval('bar/foo')
              .reset_index(name='bar/foo')
          )
    

    Or div:

    tmp = (df.groupby(['level0', 'level1', 'level2'])
             .size().unstack()
          )
    
    out = (tmp['bar']
           .div(tmp['foo'])
           .reset_index(name='bar/foo')
          )
    

    Output:

       level0 level1   bar/foo
    0       0      A  1.000000
    1       0      B  0.333333
    2       1      A  1.000000