Search code examples
pythonpandasjupyter-notebooksumconcatenation

Merge Rows in pandas


I want to merge rows in my df so I have one unique row per ID/Name with other values either summed (revenue) or concatenated (subject and product).

My df is similar to this:

ID   Name   Revenue   Subject   Product
123  John   125       Maths     A
123  John   75        English   B
246  Mary   32        History   B
312  Peter  67        Maths     A
312  Peter  39        Science   C

I would like to merge the rows so the output looks like this:

ID   Name   Revenue   Subject        Product
123 John    200       Maths English  A B
246 Mary    32        History        B
312 Peter   106       Maths Science  A C

Solution

  • Define a utility function as & use agg.

    def f(x): return ' '.join(list(x))
    
    df.groupby(['ID', 'Name']).agg( 
      {'Revenue': 'sum', 'Subject': f, 'Product': f} 
    )