python pandas multi-index

Pandas, arithmetic operation on grouped data


Let's say I have a pandas DataFrame that is already grouped as:

grp=df.groupby(['a','b' ]).sum()


Now I would like to calculate, for every group a, the percentage of b for each column (the b = 1 row divided by the b = 0 row). For example: P1, aaaa = 11/484; P1, aaac = 8/357; N1, aaaa = 61/7183; and so on.

Reproducible grouped data

pd.DataFrame({'aaaa': {('P 1', 0): 484,('P 1', 1): 11,}})



Solution

  • You can do:

    grp.loc[(slice(None), 1),:].droplevel(1)/grp.loc[(slice(None), 0),:].droplevel(1)
    

    In practice, grp.loc[(slice(None), 1),:] and grp.loc[(slice(None), 0),:] extract only the rows with b==1 and b==0 respectively (try it yourself and look at the output). Next, the b level has to be removed with .droplevel(1) so that the two objects have the same index (the columns are already shared). Finally, the two frames are divided with /, which works because they now have the same index and columns. Hope it is clear :)
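
The three steps described above can be sketched end to end as follows. This is a minimal illustration, not the asker's real data: the P 1 values and the N 1 value for aaaa are taken from the question, while the N 1 values for aaac are made up so that the frame is complete. The names ones, zeros, ratio and ratio_xs are only for illustration. The last lines show DataFrame.xs as a possible alternative, since it selects on a level and drops that level in one call.

```python
import pandas as pd

# Grouped totals indexed by (a, b). The P 1 values and the N 1 'aaaa'
# values come from the question; the N 1 'aaac' values are made up.
index = pd.MultiIndex.from_tuples(
    [("P 1", 0), ("P 1", 1), ("N 1", 0), ("N 1", 1)], names=["a", "b"]
)
grp = pd.DataFrame(
    {"aaaa": [484, 11, 7183, 61], "aaac": [357, 8, 200, 4]}, index=index
).sort_index()

# Step 1: select the b == 1 and b == 0 rows across every value of a.
ones = grp.loc[(slice(None), 1), :]
zeros = grp.loc[(slice(None), 0), :]

# Step 2: drop the b level so both frames are indexed by a alone.
ones = ones.droplevel(1)
zeros = zeros.droplevel(1)

# Step 3: divide; pandas aligns the two frames on index and columns.
ratio = ones / zeros

# Alternative: xs selects on a level and drops it in one call.
ratio_xs = grp.xs(1, level="b") / grp.xs(0, level="b")

print(ratio)
```

Both ratio and ratio_xs end up indexed by a only, with one row per group and the original columns. Multiply by 100 if an actual percentage is wanted rather than a fraction.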