Search code examples
pythonpandasgroup-by

Pandas groupby and add additional column and retain other columns in same DF


I have a df with 2 rows containing title,url,sku,price,mrp columns and i need to group by and add additional column which is sum of price and sum of mrp

EX DF

Base SKU Price MRP Title url Prod Sku
1 100 110 abcd www.abcd.com 1_a
1 50 55 abcd www.abcd.com 1_b

expected output

Base SKU Price MRP Title url Prod Sku Combined Price Combined MRP
1 100 110 abcd www.abcd.com 1_a 150 165
1 50 55 efgh www.efgh.com 1_b 150 165

i have tried 1.

updated_df = df.groupby(['Base SKU']).agg({'Price': 'sum','MRP':'sum'}).reset_index()

but output iam getting is

Base SKU Combined Price Combined MRP
1 150 165
1 150 165

2.

updated_df = df.groupby(['Base SKU']).agg({'Price': 'sum','MRP':'sum','Title': 'first', 'url': 'first','Prod Sku': 'first'}).reset_index()

but output getting is

Base SKU Price MRP Title url Prod Sku
1 150 165 abcd www.abcd.com 1_a

is there any way to get expected output?


Solution

  • Use groupby + transform

    df['Combined MRP'] = df.groupby(['Base SKU'])['MRP'] \
                           .transform('sum')
    

    The catch is that transform will broadcast the results back to the original size of each group, making assignment way easier.