pandas, dataframe, aggregate, pandas-groupby, divide

Pandas DataFrame divide single column by the sum of the column groups


I am working with a DataFrame where I want to find the fraction that each element contributes to its group's total.

For example, I have the following DataFrame:

    a
Out[295]: 
  c1  c2  c3
0  a  p1   1
1  b  p1   2
2  c  p2   3
3  d  p3   4

I want to get the sum of each group by c2 and then divide c3 by this sum. I can use the groupby function to get the sums:

b = a.groupby('c2').aggregate({'c3': 'sum'})

b
Out[298]: 
    c3 
c2    
p1   3
p2   3
p3   4

But then I don't know how to divide just the column c3 by those sums to get the following:

  c1  c2  c3
0  a  p1   0.333
1  b  p1   0.667
2  c  p2   1.000
3  d  p3   1.000

Solution

  • You can use transform, which returns the group sums aligned with the original rows:

    b = a.groupby('c2').c3.transform('sum')
    b
    Out[451]: 
    0    3
    1    3
    2    3
    3    4
    Name: c3, dtype: int64
    a['c3']/=b
    a
    Out[453]: 
      c1  c2        c3
    0  a  p1  0.333333
    1  b  p1  0.666667
    2  c  p2  1.000000
    3  d  p3  1.000000
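
For reference, the same idea as a self-contained script. This is a minimal sketch that rebuilds the example frame from the question; the names `group_sum` and `result` are illustrative and not from the original post, and `assign` is used here (as an alternative to the in-place `/=`) so the original frame is left unchanged.

```python
import pandas as pd

# Example data from the question
a = pd.DataFrame({
    "c1": ["a", "b", "c", "d"],
    "c2": ["p1", "p1", "p2", "p3"],
    "c3": [1, 2, 3, 4],
})

# transform('sum') returns one value per original row (the sum of that
# row's c2 group), so it lines up with a's index
group_sum = a.groupby("c2")["c3"].transform("sum")

# Divide only c3 by its group sum; assign returns a new frame
result = a.assign(c3=a["c3"] / group_sum)
print(result)
```

Because `transform` keeps the original index, no merge or join back onto `a` is needed before dividing.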