I have a pandas dataframe in the following format:
id name zip
123 aaa 614
123 nnn 615
341 yun 318
441 ros 911
For every group of rows sharing the same id, I want to create a new column holding a single uuid for that group. Below is my code for that, but instead of assigning the same uuid to rows with the same id, it creates a different one for every row.
df_complete = pd.DataFrame([])
for new_id in df['id'].unique():
    df_interim = df[df['id'] == new_id]
    df_interim['uuid'] = df_interim['id'].apply(lambda _: uuid.uuid4())
    df_complete.append(df_interim)
Expected output:
id name zip uuid
123 aaa 614 uuid_1
123 nnn 615 uuid_1
341 yun 318 uuid_2
441 ros 911 uuid_3
Any leads or suggestions would be appreciated
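In your loop, .apply() calls uuid.uuid4() once per row, which is why every row ends up with a different value. You can instead use .groupby() and .transform(), and call the uuid function within .transform(). When the function returns a single value, .transform() sets that same value (the uuid) for all entries in the same group (same id), as follows: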
df['uuid'] = df.groupby('id')['id'].transform(lambda _: uuid.uuid4())
Result:
print(df)
id name zip uuid
0 123 aaa 614 e7d7c519-52e0-486f-99f2-722b73c16242
1 123 nnn 615 e7d7c519-52e0-486f-99f2-722b73c16242
2 341 yun 318 dc24c9d0-4c52-44ab-ac19-c6ce64fed5b7
3 441 ros 911 0a14dc45-cbe7-43aa-8b54-90ef88ca8a7e
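
As an alternative, here is a minimal sketch that generates one uuid per distinct id up front and then looks it up for each row with .map(). The sample frame is rebuilt from the data in the question, and the name id_to_uuid is just an illustrative choice, not something from the original code.

```python
import uuid

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [123, 123, 341, 441],
    "name": ["aaa", "nnn", "yun", "ros"],
    "zip": [614, 615, 318, 911],
})

# Generate exactly one uuid for each distinct id.
# (id_to_uuid is a hypothetical helper name used for illustration.)
id_to_uuid = {key: uuid.uuid4() for key in df["id"].unique()}

# Look up each row's id in the mapping, so rows sharing an id share a uuid.
df["uuid"] = df["id"].map(id_to_uuid)

print(df)
```

Keeping the mapping around can also be handy if you later need to assign the same uuids to another dataframe that uses the same ids.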