Remove duplicate values within a row in pandas


I have a DataFrame:

Name  Symbol              Dummy
A     (BO),(BO),(AD),(TR)   2
B     (TV),(TV),(TV)        2
C     (HY)                  2
D     (UI)                  2

I need the DataFrame to look like this:

Name  Symbol              Dummy
A     (BO),(AD),(TR)        2
B     (TV)                  2
C     (HY)                  2
D     (UI)                  2

I tried drop_duplicates, but it does not work as expected.

Solution

  • Split each string on the delimiter "," to get a list of tokens, dedupe the list with dict.fromkeys (which keeps the first occurrence of each token and preserves their order), then join the remaining tokens back together with ","

    df['Symbol'] = df['Symbol'].str.split(',').map(dict.fromkeys).str.join(',')
    

      Name          Symbol  Dummy
    0    A  (BO),(AD),(TR)      2
    1    B            (TV)      2
    2    C            (HY)      2
    3    D            (UI)      2
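
For readers who want to run this end to end, here is a minimal, self-contained sketch of the same split / dedupe / join idea. The sample frame is rebuilt from the question, and the helper name dedupe_tokens and its sep parameter are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Symbol": ["(BO),(BO),(AD),(TR)", "(TV),(TV),(TV)", "(HY)", "(UI)"],
    "Dummy": [2, 2, 2, 2],
})


def dedupe_tokens(cell, sep=","):
    """Hypothetical helper: drop repeated tokens, keeping first-seen order."""
    # dict.fromkeys keeps one key per distinct token, in insertion order.
    return sep.join(dict.fromkeys(cell.split(sep)))


# Apply the helper to every cell of the column.
df["Symbol"] = df["Symbol"].map(dedupe_tokens)
print(df)
```

Both this helper and the one-liner above assume every cell is a string. If the column may contain missing values, passing na_action='ignore' to Series.map is one way to leave those cells untouched.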