python pandas loops concatenation

Concat and merge dataframe


How can we transform the original dataset to obtain the expected dataset, with one row per block and its subblocks joined by commas?

Original dataset:

|Subblock| Blocks |
|:-------|:------ |
|U       |CLON1177|
|Z       |CLON1177|
|A       |CLON1254|
|B       |CLON1254|

Expected dataset:

|Blocks  |Subblock|
|:-----  |:-------|
|CLON1177|U,Z     |
|CLON1254|A,B     |

Solution

  • You can try the following approach:

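    # import pandas library
    import pandas as pd

    # build the original dataset shown in the question
    df = pd.DataFrame({
        'Subblock': ['U', 'Z', 'A', 'B'],
        'Blocks': ['CLON1177', 'CLON1177', 'CLON1254', 'CLON1254'],
    })

    # replace each Subblock with the comma-joined Subblock values of its Blocks group
    df['Subblock'] = df.groupby('Blocks')['Subblock'].transform(lambda x: ','.join(x))

    # drop the now-duplicated rows, reset the index,
    # and put the columns in the order of the expected dataset
    df = df.drop_duplicates().reset_index(drop=True)[['Blocks', 'Subblock']]

    # show the dataframe
    print(df)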
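
As an alternative to `transform` followed by `drop_duplicates`, the same result can probably be reached in one step by aggregating each group directly. This is only a sketch: it assumes the data matches the tables in the question and that every `Subblock` value is a string. The name `result` is chosen here for illustration.

```python
import pandas as pd

# Sample data copied from the "Original dataset" table in the question.
df = pd.DataFrame({
    "Subblock": ["U", "Z", "A", "B"],
    "Blocks": ["CLON1177", "CLON1177", "CLON1254", "CLON1254"],
})

# Group by Blocks and join each group's Subblock values into one string.
# sort=False keeps the groups in order of first appearance;
# reset_index() turns the Blocks group key back into a regular column.
result = (
    df.groupby("Blocks", sort=False)["Subblock"]
      .agg(",".join)
      .reset_index()
)

print(result)
```

Because the aggregation produces one row per block, there are no duplicates to drop afterwards, and the columns already come out as `Blocks`, `Subblock`. If `Subblock` could hold non-string values, converting it first with `df["Subblock"].astype(str)` would avoid a `TypeError` from `str.join`.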