pandas interpolation nan missing-data imputation

Fill only last among of consecutive NaN in Pandas by mean of previous and next valid values

Fill only last among of consecutive NaN in Pandas by mean of previous and next valid values. If one NaN, then fill with mean of next and previous. If two consecutive NaN, impute second one with mean of next and previous valid values.

Series:

expected output:

Solution

Idea is remove consecutive missing values without last, then use interpolate and assign back last missing value by condition:

m =  df['header'].isna()
mask = m & ~m.shift(-1, fill_value=False)

df.loc[mask, 'header'] = df.loc[mask | ~m, 'header'].interpolate()
print (df)
    header
0     10.0
1     20.0
2     20.0
3     20.0
4     30.0
5      NaN
6     35.0
7     40.0
8     10.0
9      NaN
10     NaN
11    30.0
12    50.0

Details:

print (df.assign(m=m, mask=mask))
    header      m   mask
0     10.0  False  False
1     20.0  False  False
2     20.0   True   True
3     20.0  False  False
4     30.0  False  False
5      NaN   True  False
6     35.0   True   True
7     40.0  False  False
8     10.0  False  False
9      NaN   True  False
10     NaN   True  False
11    30.0   True   True
12    50.0  False  False


print (df.loc[mask | ~m, 'header'])
0     10.0
1     20.0
2      NaN
3     20.0
4     30.0
6      NaN
7     40.0
8     10.0
11     NaN
12    50.0
Name: header, dtype: float64

Solution for interpolate per groups is:

df.loc[mask, 'header'] = df.loc[mask | ~m, 'header'].groupby(df['groups'])
                                                    .transform(lambda x: x.interpolate())