I have a data frame as follows:
Obs. ID Name type
1) 123 abc duplicate
2) 123 abc duplicate
3) 145 abc abc
4) 156 abc duplicate
5) 156 abc duplicate
If the ID is the same, as in obs. 1 and 2 or 4 and 5, I want to create a new variable with type = duplicate; otherwise type should take the value of the Name variable (i.e. abc).
We can use duplicated with np.where to set the values according to the result:
df['type'] = np.where(df.duplicated('ID', keep=False), 'Duplicate', 'Single')
print(df)
Obs. ID Name type
0 1) 123 abc Duplicate
1 2) 123 abc Duplicate
2 3) 145 abc Single
3 4) 156 abc Duplicate
4 5) 156 abc Duplicate
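The second argument to duplicated is keep. A minimal sketch of why it is set to False here, assuming a frame rebuilt from the question's ID and Name columns (the variable names first_kept and all_marked are only illustrative):

```python
import pandas as pd

# Sample frame rebuilt from the question's data (ID and Name columns only).
df = pd.DataFrame({
    "ID": [123, 123, 145, 156, 156],
    "Name": ["abc"] * 5,
})

# Default keep='first': the first occurrence of each ID is NOT flagged.
first_kept = df.duplicated("ID")

# keep=False: every row whose ID occurs more than once is flagged.
all_marked = df.duplicated("ID", keep=False)

print(first_kept.tolist())  # [False, True, False, False, True]
print(all_marked.tolist())  # [True, True, False, True, True]
```

With the default, rows 0 and 3 would be labelled as non-duplicates, which is not what the question asks for.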
For the update, you just need a simple tweak:
df['type'] = np.where(~df.duplicated('ID', keep=False), df['Name'], 'Duplicate')
print(df)
Obs. ID Name type
0 1) 123 abc Duplicate
1 2) 123 abc Duplicate
2 3) 145 abc abc
3 4) 156 abc Duplicate
4 5) 156 abc Duplicate
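For anyone who wants to run this end to end, here is a self-contained sketch of the same approach. It assumes the frame has just the ID and Name columns from the question; the intermediate name is_dup is only for readability and is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's data.
df = pd.DataFrame({
    "ID": [123, 123, 145, 156, 156],
    "Name": ["abc", "abc", "abc", "abc", "abc"],
})

# True for every row whose ID appears more than once.
is_dup = df.duplicated("ID", keep=False)

# Repeated IDs get the label 'Duplicate'; unique IDs fall back to Name.
df["type"] = np.where(is_dup, "Duplicate", df["Name"])

print(df)
```

Putting the mask in its own variable avoids the ~ negation and makes the condition easy to reuse, for example df[is_dup] to inspect only the repeated rows.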