How can I impute every column in a DataFrame with its respective class mean?

If I have two groups, 0 and 1, in a column labeled "Group Label", how can I fill the missing values in every other column with the mean of the row's group, rather than the mean of the entire column?

This is the code I have so far. It splits the DataFrame into two groups but does not calculate the correct mean:

df1 = df.groupby("group_label").transform(lambda x: x.fillna(x.mean()))

Also, it seems to be dropping string columns such as my ID column.

Thanks in advance

Solution

You can pass the result of .transform('mean') on the grouped DataFrame to .fillna(). Because the mean is only defined for numeric data, you need to specify which columns to apply it to:

# for a single column
df['col1'] = df['col1'].fillna(
    df.groupby('Group_Label')['col1'].transform('mean'))

# for multiple columns
df[['col1', 'col2', ...]] = df[['col1', 'col2', ...]].fillna(
    df.groupby('Group_Label')[['col1', 'col2', ...]].transform('mean'))
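If you would rather not list the columns by hand, one option is to select the numeric columns automatically. The following is a minimal sketch under assumptions: the sample data and the column names ID, col1 and col2 are made up for illustration, and the grouping column is assumed to be called Group_Label as in the snippets above. String columns such as the ID are left untouched because they are never passed through the group-by.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names and values are illustrative only.
df = pd.DataFrame({
    "ID": ["a", "b", "c", "d", "e", "f"],
    "Group_Label": [0, 0, 0, 1, 1, 1],
    "col1": [1.0, np.nan, 3.0, 10.0, 20.0, np.nan],
    "col2": [np.nan, 4.0, 6.0, np.nan, 8.0, 12.0],
})

# Pick every numeric column except the grouping column itself.
num_cols = (
    df.select_dtypes(include="number")
    .columns.drop("Group_Label")
    .tolist()
)

# transform("mean") returns a frame with the same index as df, where each
# row holds the mean of that row's group for every selected column.
group_means = df.groupby("Group_Label")[num_cols].transform("mean")

# fillna aligns on index and column labels, so only the NaN cells are replaced.
df[num_cols] = df[num_cols].fillna(group_means)

print(df)
```

With this sample data, the missing col1 value in group 0 is filled with 2.0 (the mean of 1.0 and 3.0) and the one in group 1 with 15.0, while the ID column is carried through unchanged.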