Calculate Percent of Groupby Variable to Sum Column


I can't find a similar example that explains how to do this in Python. I have a dataset that looks like this:

ID    Capacity
A     50
A     50
A     50
B     30
B     30
B     30
C    100
C    100
C    100

I need to find each ID's share of the total "Capacity", counting each ID's capacity only once (50 + 30 + 100 = 180). So, the answer looks like this:

ID    Capacity   Percent_Capacity
A     50         0.2777
A     50         0.2777
A     50         0.2777
B     30         0.1666
B     30         0.1666
B     30         0.1666
C    100         0.5555
C    100         0.5555
C    100         0.5555

Thank you - still learning Python.


Solution

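  • # One Capacity value per ID (each ID repeats the same value), summed: 50 + 30 + 100 = 180
    total = df.groupby('ID')['Capacity'].first().sum()
    # Divide each row's Capacity by that total
    df['percent_capacity'] = df['Capacity'] / total
    df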
    
        ID  Capacity    percent_capacity
    0   A         50    0.277778
    1   A         50    0.277778
    2   A         50    0.277778
    3   B         30    0.166667
    4   B         30    0.166667
    5   B         30    0.166667
    6   C        100    0.555556
    7   C        100    0.555556
    8   C        100    0.555556
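
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and assumes, as the answer above does, that Capacity is constant within each ID; the `nunique` check and the names `per_id` and `constant_within_id` are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": list("AAABBBCCC"),
    "Capacity": [50, 50, 50, 30, 30, 30, 100, 100, 100],
})

# Assumption check: every ID should carry exactly one distinct Capacity value.
constant_within_id = df.groupby("ID")["Capacity"].nunique().eq(1).all()

# One Capacity value per ID, then the total across IDs: 50 + 30 + 100 = 180.
per_id = df.groupby("ID")["Capacity"].first()
total = per_id.sum()

# Look up each row's per-ID capacity and divide by the total.
df["Percent_Capacity"] = df["ID"].map(per_id) / total
print(df)
```

If the check fails, `first()` would silently pick whichever value appears first for an ID, so the per-ID value (first, mean, max, and so on) would need to be chosen deliberately.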