python pandas dataframe missing-data fillna

How to use each vector entry to fill NAN's of a separate groups in a dataframe

Say I have a vector ValsHR which looks like this:

valsHR=[78.8, 82.3, 91.0]

And I have a dataframe MainData

Age  Patient  HR             
21   1        NaN
21   1        NaN
21   1        NaN
30   2        NaN
30   2        NaN
24   3        NaN
24   3        NaN
24   3        NaN

I want to fill the NaNs so that the first value in valsHR will only fill in the NaNs for patient 1, the second will fill the NaNs for patient 2 and the third will fill in for patient 3.

So far I've tried using this: mainData['HR'] = mainData['HR'].fillna(ValsHR) but it fills all the NaNs with the first value in the vector.

I've also tried to use this: mainData['HR'] = mainData.groupby('Patient').fillna(ValsHR) fills the NaNs with values that aren't in the valsHR vector at all.

I was wondering if anyone knew a way to do this?

Solution

Create dictionary by Patient values with missing values, map to original column and replace missing values only:

print (df)
   Age  Patient     HR
0   21        1    NaN
1   21        1    NaN
2   21        1    NaN
3   30        2  100.0 <- value is not replaced
4   30        2    NaN
5   24        3    NaN
6   24        3    NaN
7   24        3    NaN


p = df.loc[df.HR.isna(), 'Patient'].unique()
valsHR = [78.8, 82.3, 91.0]

df['HR'] = df['HR'].fillna(df['Patient'].map(dict(zip(p, valsHR))))
print (df)
   Age  Patient     HR
0   21        1   78.8
1   21        1   78.8
2   21        1   78.8
3   30        2  100.0
4   30        2   82.3
5   24        3   91.0
6   24        3   91.0
7   24        3   91.0

If some groups has no NaNs:

print (df)
   Age  Patient     HR
0   21        1    NaN
1   21        1    NaN
2   21        1    NaN
3   30        2  100.0 <- group 2 is not replaced
4   30        2  100.0 <- group 2 is not replaced
5   24        3    NaN
6   24        3    NaN
7   24        3    NaN


p = df.loc[df.HR.isna(), 'Patient'].unique()
valsHR = [78.8, 82.3, 91.0]

df['HR'] = df['HR'].fillna(df['Patient'].map(dict(zip(p, valsHR))))
print (df)
   Age  Patient     HR
0   21        1   78.8
1   21        1   78.8
2   21        1   78.8
3   30        2  100.0
4   30        2  100.0
5   24        3   82.3
6   24        3   82.3
7   24        3   82.3