Search code examples
pythonpandasdataframesummulti-index

Sum values for second level of multi-index in pandas dataframe


I have a pandas DataFrame with a multi-index of 'County ID' and 'County'. There is a 'Workforce' column and I need to sum all the values in the column, then add that to another column in the DataFrame, but I need to sum only the values in each level. However ,I have been unable to get pandas to sum in this manner.

I've tried Many different things including making other DataFrames to sum each index then re-add to the other DataFrame. I have also tried loops but have been unable to not sum all the values in the column together.

My dataframe looks like this: enter image description here

*COUNTY ID*   *COUNTY*             INDUSTRY  WORKFORCE  TOTAL WORKFORCE
12001      |  Alachua County, FL    Retail     25329        0
           |  Alachua County, FL    Health     22914        0
8013       |  Boulder County, CO    Education  24123        0
           |  Boulder County, CO    Finance    19478        0
# And so on in that manner 

'TOTAL WORKFORCE' is a dummy column I added as an int64 so that I could place a numerical value in it as a result of the sum I need.

I want to take all the workforce counts and add them together per each county so it would look like this:

*COUNTY ID*   *COUNTY*             INDUSTRY  WORKFORCE  TOTAL WORKFORCE
12001      |  Alachua County, FL    Retail     25329        48243
           |  Alachua County, FL    Health     22914        48243
8013       |  Boulder County, CO    Education  24123        43601
           |  Boulder County, CO    Finance    19478        43601

Any help would be great thanks!


Solution

  • A groupby on level=0 and transforming with the sum should do:

    df['TOTALWORKFORCE'] = df.groupby(level=0).WORKFORCE.transform('sum')