I am working with a DataFrame and want to find the percentage that each element contributes to its group.
For example, I have the following DataFrame:
a
Out[295]:
  c1  c2  c3
0  a  p1   1
1  b  p1   2
2  c  p2   3
3  d  p3   4
I want to get the sum of each group by c2 and then divide c3 by this sum. I can use the groupby function to get the sums:
b = a.groupby('c2').aggregate({'c3': 'sum'})
b
Out[298]:
    c3
c2
p1   3
p2   3
p3   4
But then I don't know how to divide just the column c3 by those sums to get the following:
  c1  c2     c3
0  a  p1  0.333
1  b  p1  0.667
2  c  p2  1.000
3  d  p3  1.000
You can use transform, which returns the group sums aligned to the original rows:
b = a.groupby('c2').c3.transform('sum')
b
Out[451]:
0    3
1    3
2    3
3    4
Name: c3, dtype: int64
a['c3']/=b
a
Out[453]:
  c1  c2        c3
0  a  p1  0.333333
1  b  p1  0.666667
2  c  p2  1.000000
3  d  p3  1.000000
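
If you would rather keep the original c3 values, a variant of the same idea is to write the ratio to a new column instead of dividing in place. The following is a minimal, self-contained sketch that rebuilds the example frame from the question; the names `group_sums`, `result`, and the column name `share` are just illustrative choices, not anything from the original post.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
a = pd.DataFrame({
    "c1": ["a", "b", "c", "d"],
    "c2": ["p1", "p1", "p2", "p3"],
    "c3": [1, 2, 3, 4],
})

# transform('sum') returns a Series with the same index as `a`,
# holding each row's group total, so it can be used in row-wise division.
group_sums = a.groupby("c2")["c3"].transform("sum")

# Store the fraction in a new column ("share" is a hypothetical name)
# so the original c3 values are left unchanged.
result = a.assign(share=a["c3"] / group_sums)
print(result)
```

Because assign returns a new DataFrame, `a` itself is not modified here; multiply `share` by 100 if you want an actual percentage rather than a fraction.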