I have a pandas DataFrame with more than 4 columns. Some values in col1 are missing, and I want to fill those missing values using the following approach:
What's the best way to do this?
Based on your logic, you can chain fillna calls as follows, where each fillna
call corresponds to a bullet point in your question, in the same order:
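df['col1'] = (df['col1']
    # 1. mean of col1 over rows sharing col2, col3 and col4
    .fillna(df.groupby(['col2', 'col3', 'col4'])['col1'].transform('mean'))
    # 2. still missing: mean over rows sharing col2 and col3
    .fillna(df.groupby(['col2', 'col3'])['col1'].transform('mean'))
    # 3. still missing: mean over rows sharing col2
    .fillna(df.groupby(['col2'])['col1'].transform('mean'))
    # 4. last resort: overall mean of col1
    .fillna(df['col1'].mean())
)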
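
To check the fallback order on something small, here is a minimal sketch of the same cascade written as a loop. The sample data and the helper name fill_by_group_means are made up for illustration; only the column names come from the question. Every group mean is computed from the original col1, so a value filled at one level never feeds into the mean used at the next level.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'col1': [1.0, np.nan, 3.0, np.nan, 8.0, np.nan],
    'col2': ['a', 'a', 'a', 'a', 'b', 'c'],
    'col3': ['x', 'x', 'y', 'z', 'x', 'x'],
    'col4': ['p', 'p', 'p', 'p', 'p', 'p'],
})

def fill_by_group_means(frame, target, key_levels):
    """Fill NaNs in `target` with group means, from most to least specific keys."""
    filled = frame[target]
    for keys in key_levels:
        # transform('mean') returns a Series aligned with frame's index;
        # groups whose target values are all NaN yield NaN and fall through.
        filled = filled.fillna(frame.groupby(keys)[target].transform('mean'))
    # Last resort: overall mean of the original column.
    return filled.fillna(frame[target].mean())

df['col1'] = fill_by_group_means(
    df, 'col1',
    [['col2', 'col3', 'col4'], ['col2', 'col3'], ['col2']],
)
print(df['col1'].tolist())  # → [1.0, 1.0, 3.0, 2.0, 8.0, 4.0]
```

Here row 1 is filled from its (col2, col3, col4) group, row 3 falls back to the col2 group mean because its finer groups contain no known values, and row 5 gets the overall mean because its col2 value has no known col1 at all.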