Add a new column with same value for every group in Pandas


I have a pandas DataFrame in the following format:

 id   name   zip
 123  aaa    614
 123  nnn    615
 341  yun    318
 441  ros    911

For each unique id, I want to add a new uuid column so that all rows sharing an id get the same UUID. The code below attempts this, but instead of generating one UUID per id, it generates a different UUID for every row.

 df_complete = pd.DataFrame([])
 for new_id in df['id'].unique():
    df_interim = df[df['id'] == new_id]
    df_interim['uuid'] = df_interim['id'].apply(lambda _: uuid.uuid4())
    df_complete.append(df_interim)

Expected output:

 id   name   zip   uuid
 123  aaa    614   uuid_1
 123  nnn    615   uuid_1
 341  yun    318   uuid_2
 441  ros    911   uuid_3

Any leads or suggestions would be appreciated.


Solution

  • You can use .groupby() with .transform() and call uuid.uuid4() inside the function passed to .transform(). The function runs once per group (once per id), and its single return value is broadcast to every row of that group, so all rows with the same id get the same UUID:

    import uuid  # standard library; provides uuid4()
    df['uuid'] = df.groupby('id')['id'].transform(lambda _: uuid.uuid4())

    Result:

    print(df)

        id name  zip                                  uuid
    0  123  aaa  614  e7d7c519-52e0-486f-99f2-722b73c16242
    1  123  nnn  615  e7d7c519-52e0-486f-99f2-722b73c16242
    2  341  yun  318  dc24c9d0-4c52-44ab-ac19-c6ce64fed5b7
    3  441  ros  911  0a14dc45-cbe7-43aa-8b54-90ef88ca8a7e
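
For comparison, here is a minimal self-contained sketch of another way to get the same shape of result without groupby. It builds one UUID per distinct id up front and then looks it up for each row with .map(). This is not the method from the answer above, only an alternative. The names id_to_uuid and uuids_per_id are illustrative, and the sample frame is rebuilt from the data in the question. It also shows why the loop in the question misbehaves: .apply() calls the lambda once per row, so every row gets a fresh uuid4(), whereas here uuid4() is called once per id.

```python
import uuid

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "id": [123, 123, 341, 441],
        "name": ["aaa", "nnn", "yun", "ros"],
        "zip": [614, 615, 318, 911],
    }
)

# Call uuid4() exactly once per distinct id and remember the result.
# str() stores plain strings instead of uuid.UUID objects (a choice made
# for this sketch; drop it to keep UUID objects).
id_to_uuid = {key: str(uuid.uuid4()) for key in df["id"].unique()}

# Look up the stored UUID for every row, so equal ids share one value.
df["uuid"] = df["id"].map(id_to_uuid)

# Sanity check: count the distinct UUIDs seen for each id.
uuids_per_id = df.groupby("id")["uuid"].nunique()

print(df)
```

The UUID values are random, so they will differ on every run. What should hold is that rows 0 and 1 (both id 123) share one value and the other two rows each get their own.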