Search code examples
pythonpandasgroup-bypandas-applypython-applymap

Pandas : Concat rows of a dataframe with same index to form custom string in pairs


Say I have a dataframe

df = pd.DataFrame({'colA' : ['ABC', 'JKL', 'STU', '123'],
                   'colB' : ['DEF', 'MNO', 'VWX', '456'],
                   'colC' : ['GHI', 'PQR', 'YZ', '789'],}, index = [0,0,1,1])
   colA colB colC
0  ABC   DEF  GHI 
0  JKL   MNO  PQR
1  STU   VWX   YZ
1  123   456  789

Its guranteed that every pair will have the same index, so we would like the end result to be :

     colA        colB       colC
0  ABC_JKL_0   DEF_MNO_0  GHI_PQR_0 
1  STU_123_1   VWX_456_1   YZ_789_1

where the suffix _number is the index of that group.

I tried doing this by iterating through rows but that's taking a lot of time. I was thinking of something like .groupby(level=0) but can't figure out the next aggregation apply part


Solution

  • IIUC, you can try something like this using .agg and a lambda function or you can add it into the dataframe after the groupby:

    df_out=df.groupby(level=0).agg(lambda x: '_'.join(x)+'_'+str(x.index[0]))
    

    Output:

            colA       colB       colC
    0  ABC_JKL_0  DEF_MNO_0  GHI_pQR_0
    1  STU_123_1  VWX_456_1   YZ_789_1
    

    Or

    df_out=df.groupby(level=0).agg('_'.join)
    df_out = df_out.add('_'+df_out.index.to_series().astype(str), axis=0)
    print(df_out)