So, I hope you know the famous Titanic question. This is what I did so far by following the tutorial. Now I want to replace the NaN values in the Age column with the median of part of that column, where the selected part has a certain value of "Title".
For example, I want to replace the NaN values of Age where Title == "Mr", so the median age for "Mr" is filled into the missing places that also have Title == "Mr".
I tried this:
for val in data["Title"].unique():
    median_age = data.loc[data.Title == val, "Age"].median()
    data.loc[data.Title == val, "Age"].fillna(median_age, inplace=True)
But Age still shows up as NaN. How can I do this?
Use combine_first to fill the NaN values. I have no Title column in my dataset, so I group by Sex here, but it works the same way:
df['Age'] = df['Age'].combine_first(df.groupby('Sex')['Age'].transform('median'))
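As for why the loop in the question leaves Age unchanged: data.loc[mask, "Age"] with a boolean mask returns a new Series, so fillna(..., inplace=True) fills that temporary object and not the original DataFrame. Below is a minimal sketch of the same idea grouped by Title. The small frame and its values are made up for illustration, and fillna is used in place of combine_first, which should give the same result here:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the Titanic frame (values are made up)
data = pd.DataFrame({
    "Title": ["Mr", "Mr", "Mr", "Mrs", "Mrs", "Mrs"],
    "Age": [30.0, np.nan, 40.0, 28.0, np.nan, 36.0],
})

# Median age of each Title group, aligned row by row with the original index
group_median = data.groupby("Title")["Age"].transform("median")

# Fill only the missing ages, and assign the result back to the column
data["Age"] = data["Age"].fillna(group_median)

print(data)
```

If you prefer to keep the loop, assigning through .loc should also work, for example data.loc[(data.Title == val) & data.Age.isna(), "Age"] = median_age, because that writes into the DataFrame itself.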