In the table below, I need to fill NaN only in the Week columns. Each NaN should be replaced with the mean of the Week values in that row.
+----+---------+------+-------+-------+-------+-------+
| ID | Feature | Paid | Week1 | Week2 | Week3 | Week4 |
+----+---------+------+-------+-------+-------+-------+
| 1 | 1 | 1 | 12 | NaN | NaN | NaN |
+----+---------+------+-------+-------+-------+-------+
| 2 | 0 | 1 | 34 | 23 | NaN | NaN |
+----+---------+------+-------+-------+-------+-------+
| 3 | 1 | 0 | 24 | 13 | 14 | NaN |
+----+---------+------+-------+-------+-------+-------+
Code:
df.fillna(df[['Week1', 'Week2', 'Week3', 'Week4']].mean(axis=1), axis=1, inplace=True)
This raises NotImplementedError: Currently only can fill with dict/Series column by column
You can use filter() to select the columns whose names contain 'Week', compute the row-wise mean once and store it in a variable (so it is not recomputed for every column), and then fill the NaNs with fillna():
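# Select the columns whose names contain 'Week'
cols = df.filter(regex='Week').columns
# Row-wise mean of those columns (NaN is skipped); round() is optional
m = df[cols].mean(axis=1).round()
# Fill each Week column from the row means, aligned on the index
df = df.fillna({x: m for x in cols})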
Output:
   ID  Feature  Paid  Week1  Week2  Week3  Week4
0   1        1     1     12   12.0   12.0   12.0
1   2        0     1     34   23.0   28.0   28.0
2   3        1     0     24   13.0   14.0   17.0
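
If you want to try this end to end, here is a minimal self-contained sketch. It rebuilds the table from the question by hand, and the names week_cols, row_mean and filled are only illustrative. It also leaves out the round() call, on the assumption that you want the exact row mean, so the gaps in the second row become 28.5 rather than 28.0.

```python
import numpy as np
import pandas as pd

# Rebuild the table from the question (values copied from it).
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Feature": [1, 0, 1],
    "Paid": [1, 1, 0],
    "Week1": [12, 34, 24],
    "Week2": [np.nan, 23, 13],
    "Week3": [np.nan, np.nan, 14],
    "Week4": [np.nan, np.nan, np.nan],
})

# Columns whose names contain 'Week'.
week_cols = df.filter(regex="Week").columns

# Row-wise mean over the Week columns; NaN values are skipped by default.
row_mean = df[week_cols].mean(axis=1)

# A dict of {column name: Series} fills each listed column separately,
# matching the Series to the rows by index. Other columns are untouched.
filled = df.fillna({col: row_mean for col in week_cols})

print(filled)
```

Passing a dict keyed by column name is what sidesteps the NotImplementedError: each Week column is filled on its own from a Series aligned on the row index, instead of asking fillna to broadcast one Series across axis=1.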