Search code examples
pandasgroup-by

Pandas: adding a new column to MultiIndex


I am having trouble working out a way to add a extra column that has the sum of the items in the bin column.

This is because I would like a way to show bins that have less than so many items remanding later on. But still need to see the individual quantity too.

I am right in thinking adding an extra column by applying a lambda function?

Appreciate any help or advise, thank you.

data = [['Jacket 1', 10, 'A'], ['Jacket 2', 2, 'A'], ['Jacket 3', 1, 'B'], ['Jacket 4', 3, 'B']]

df = pd.DataFrame(data, columns=['item', 'qty', 'bin'])

df

item    qty bin
0   Jacket 1    10  A
1   Jacket 2    2   A
2   Jacket 3    1   B
3   Jacket 4    3   B

grouped = df.groupby(['bin', 'item']).agg({'qty' : 'sum'})


grouped

               qty
bin item    
A   Jacket 1    10
    Jacket 2    2
B   Jacket 3    1
    Jacket 4    3

Below is want I am trying to get. If this is possible?

               qty     Total 
bin item    
A   Jacket 1    10      12
    Jacket 2    2
B   Jacket 3    1       4
    Jacket 4    3

Solution

  • If you really want to get creative and use the walrus operator, you can do this in a one-liner:

    (dfs:=df.groupby(['bin', 'item']).agg({'qty' : 'sum'})).assign(total = dfs.groupby('bin')['qty'].transform('sum').drop_duplicates())
    

    Output:

                  qty  total
    bin item                
    A   Jacket 1   10   12.0
        Jacket 2    2    NaN
    B   Jacket 3    1    4.0
        Jacket 4    3    NaN