Search code examples
pythonpandasmulti-index

vectorized way to add a calculated row to a multindex's subindex


I cannot get my head around how to use groupby to solve the following example:

df = pd.DataFrame(
    {
        'kitchen': ['galley', 'house', 'restaurant', 'caterer'] * 3,
        'products': ['chocolate', 'tart', 'pie', ] * 4,
        'menu_a': [pd.np.random.randint(100000, 999999) for _ in range(12)],
        'menu_b': [pd.np.random.randint(100000, 999999) for _ in range(12)],
        'menu_c': [pd.np.random.randint(100000, 999999) for _ in range(12)]
    }
).set_index(['kitchen', 'products']).sort_index()

df

What I want to do is replace the "pie" and "tart" rows of each kitchen with the sum of pie+tart for each kitchen.

enter image description here

So for example, in the galley kitchen, the new row under products would be pastries and the value under menu_a would be 333163+612456 = 945619 for each of the kitchen x product x menus.

I've tried many versions stack() unstack() and groupby() mixed togther but cannot quite get the result. The alternative is to do this iteratively/apply()'d outside, which is gross, and this is a frequent problem I encounter. Would like to know how to do it right.


Solution

  • Select rows by second level, sum and add second level:

    df1 = (df.loc[pd.IndexSlice[:, ['pie','tart']], :]
             .sum(level=0)
             .assign(products='total')
             .set_index('products', append=True))
    

    Then concat to original and remove used values by list:

    df = pd.concat([df, df1]).drop(['pie','tart'], level=1).sort_index()
    print (df)
                           menu_a   menu_b   menu_c
    kitchen    products                            
    caterer    chocolate   907615   167480   921843
               total       749664   786464   872046
    galley     chocolate   939850   382545   525525
               total      1204359   907760  1267475
    house      chocolate   701797   106570   572014
               total      1215235  1058951   812935
    restaurant chocolate   734501   637600   216367
               total      1846097   345020   517969