Tags: python-2.7, pandas, row, unique

Find unique values row-wise in comma-separated values


For a dataframe like the one below:

df = pd.DataFrame({'col':['abc,def,ghi,jkl,abc','abc,def,ghi,def,ghi']})

How can I get the unique values of the column col, row by row, in a new column, as follows?

          col             unique_col
0  abc,def,ghi,jkl,abc    abc,def,ghi,jkl
1  abc,def,ghi,def,ghi    abc,def,ghi

I tried using iteritems but got an AttributeError:

for i, item in df.col.iteritems():
    print item.unique()

Solution

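    import pandas as pd

    df = pd.DataFrame({'col':['abc,def,ghi,jkl,abc','abc,def,ghi,def,ghi']})


    def unique_col(col):
        # Split on commas, drop duplicates with a set, and re-join.
        # A set does not preserve the original order of the values.
        return ','.join(set(col.split(',')))


    # Apply the helper to every row of the column.
    df['unique_col'] = df.col.apply(unique_col)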
    

    Result (a set is unordered, so the order of values within each row can vary):

                       col       unique_col
    0  abc,def,ghi,jkl,abc  ghi,jkl,abc,def
    1  abc,def,ghi,def,ghi      ghi,abc,def
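
The expected output in the question keeps each value in the position where it first appears. If that ordering matters, one possible variant is sketched below. It is not part of the original answer: the helper name unique_in_order and its sep parameter are made up for illustration, and it relies on collections.OrderedDict, which is available in Python 2.7 and 3.

```python
from collections import OrderedDict


def unique_in_order(text, sep=','):
    # Hypothetical helper (not from the original answer).
    # OrderedDict.fromkeys keeps only the first occurrence of each key
    # and remembers insertion order, so duplicates are dropped in place.
    return sep.join(OrderedDict.fromkeys(text.split(sep)))


# The two sample strings from the question.
rows = ['abc,def,ghi,jkl,abc', 'abc,def,ghi,def,ghi']
deduped = [unique_in_order(r) for r in rows]
```

With the dataframe from the question, the same helper should be usable in place of unique_col, for example df['unique_col'] = df.col.apply(unique_in_order).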