pandas  dataframe  group-by  pairing

How to pair consecutive rows of a DataFrame within each group?


I have a large dataset with a text column and an id column.

column        id
hello world    1
dinner         1
father         1
hi             1
work/related   2
summer         2

I want to pair each word with the word that follows it, as long as both have the same id.

Desired output:

new column
hello world, dinner
dinner, father
father, hi
work/related, summer

   

Solution

  • Use str.cat to concatenate each row with the next row in its group (obtained with shift(-1)), then drop the last row of each group, which has no partner.

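     # group_keys=False keeps the original row index, so assign() can align the result (needed on pandas 2.0+)
     df = df.assign(newcolumn=df.groupby('id', group_keys=False)['column'].apply(lambda x: x.str.cat(x.shift(-1), sep=','))).dropna()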
    
    
    
             column  id            newcolumn
    0   hello world   1   hello world,dinner
    1        dinner   1        dinner,father
    2        father   1            father,hi
    4  work/related   2  work/related,summer
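
For anyone who wants to try this end to end, here is a minimal self-contained sketch built on the sample data from the question. It is a variant rather than the exact code above: it swaps the apply call for GroupBy.shift(-1), which returns a Series on the original row index, so no lambda is needed. The names next_in_group and pairs are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "column": ["hello world", "dinner", "father", "hi", "work/related", "summer"],
    "id": [1, 1, 1, 1, 2, 2],
})

# Next value within the same id; missing on the last row of each group.
next_in_group = df.groupby("id")["column"].shift(-1)

# str.cat aligns on the index and yields a missing value wherever the
# partner is missing, so dropna() removes the last row of every group.
pairs = df.assign(
    newcolumn=df["column"].str.cat(next_in_group, sep=",")
).dropna()

print(pairs)
```

The rows that remain keep their original index labels, so the last row of each id group (here "hi" and "summer") is the one that disappears. If a different separator is wanted, such as ", ", only the sep argument needs to change.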