How can we manipulate the original dataset to obtain the expected dataset?
Original dataset:
|Subblock| Blocks |
|:-------|:------ |
|U |CLON1177|
|Z |CLON1177|
|A |CLON1254|
|B |CLON1254|
Expected dataset:
|Blocks |Subblock|
|:----- |:-------|
|CLON1177|U,Z |
|CLON1254|A,B |
You can try the following approach:
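# import pandas library
import pandas as pd
# build the original dataset from the question
df = pd.DataFrame({'Subblock': ['U', 'Z', 'A', 'B'],
                   'Blocks': ['CLON1177', 'CLON1177', 'CLON1254', 'CLON1254']})
# join the Subblock values within each Blocks group into one comma-separated string
df['Subblock'] = df.groupby('Blocks')['Subblock'].transform(lambda x: ','.join(x))
# drop the duplicate rows, then put Blocks first to match the expected layout
df = df.drop_duplicates()[['Blocks', 'Subblock']].reset_index(drop=True)
# show the dataframe
print(df)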
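If you only need one row per block, an alternative is to aggregate directly instead of using `transform` followed by `drop_duplicates`. The following is a minimal sketch, assuming the column names `Subblock` and `Blocks` from the question and that every `Subblock` value is a string; the variable name `result` is just illustrative.

```python
import pandas as pd

# Rebuild the original dataset from the question
df = pd.DataFrame({
    'Subblock': ['U', 'Z', 'A', 'B'],
    'Blocks': ['CLON1177', 'CLON1177', 'CLON1254', 'CLON1254'],
})

# Collapse each Blocks group to a single row by joining its Subblock values
result = (
    df.groupby('Blocks')['Subblock']
      .agg(','.join)
      .reset_index()
)

print(result)
```

This does not depend on other columns being identical within a group, which `drop_duplicates` does. Note that `groupby` sorts the group keys by default; pass `sort=False` if the blocks should stay in their order of first appearance.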