Search code examples
pythonpandasgroup-byunique

Store unique values from group to another column in Pandas


I have a problem with getting unique values from one column to another as value.

df I have:

id  value1 valueNo
1    12      140
1    13      149
1    11      149
2    11      nan
2    11      150
3    15      145
3    12      149

Desired output would be

id  value1 valueNo   uniqueNo
1    12      140      140, 149
1    13      149      140, 149
1    11      149      140, 149
2    11      nan      150
2    11      150      150
3    15      145      145, 149
3    12      149      145, 149

I have tried with a few approaches but nothing works for me.

df['uniqueNo']=df.groupby(['id'])['valueNo'].apply(lambda x: x.unique())
d['uniqueNo'] = df.groupby(['id'])['valueNo'].apply(list)

Solution

  • If missing values are not problem use GroupBy.transform with unique:

    df['uniqueNo']=df.groupby(['id'])['valueNo'].transform('unique')
    print (df)
       id  value1  valueNo        uniqueNo
    0   1      12    140.0  [140.0, 149.0]
    1   1      13    149.0  [140.0, 149.0]
    2   1      11    149.0  [140.0, 149.0]
    3   2      11      NaN    [nan, 150.0]
    4   2      11    150.0    [nan, 150.0]
    5   3      15    145.0  [145.0, 149.0]
    6   3      12    149.0  [145.0, 149.0]
    

    If need remove them solution is first remove them, aggregate unique and map to new column:

    s = df.dropna(subset=['valueNo'])['valueNo'].astype(int).groupby(df['id']).unique()
    #if converting to intgers is not necessary
    #s = df.dropna(subset=['valueNo']).groupby('id')['valueNo'].unique()
    df['uniqueNo'] = df['id'].map(s)
    print (df)
       id  value1  valueNo    uniqueNo
    0   1      12    140.0  [140, 149]
    1   1      13    149.0  [140, 149]
    2   1      11    149.0  [140, 149]
    3   2      11      NaN       [150]
    4   2      11    150.0       [150]
    5   3      15    145.0  [145, 149]
    6   3      12    149.0  [145, 149]