python pandas dataframe data-analysis

Interpolation of a dataframe with immediate data appearing before and after it - Pandas


Let's say I have a dataframe like this:

ID  Weight Height
1   80.0   180.0
2   60.0   170.0
3   NaN    NaN
4   NaN    NaN
5   82.0   185.0

I want the dataframe to be transformed to:

ID  Weight  Height
1   80.0    180.0
2   60.0    170.0
3   71.0    177.5
4   76.5    181.25
5   82.0    185.0

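Each NaN is replaced by the average of the nearest available values before and after it, where the value before may itself be one that was just filled in. For example, Weight for ID 3 is (60.0 + 82.0) / 2 = 71.0, and Weight for ID 4 is (71.0 + 82.0) / 2 = 76.5.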


Solution

  • You can use the interpolate method from the pandas library:

    df[['Weight', 'Height']] = df[['Weight', 'Height']].interpolate()
    

    The default is linear interpolation (method='linear'). Check the method argument in the documentation to tune this to your problem: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.interpolate.html
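
One caveat: linear interpolation spaces the filled values evenly across a gap, so on the sample data it gives roughly 67.33 and 74.67 for Weight, not the 71.0 and 76.5 shown in the question. If the running average from the question is what you actually need, something like the sketch below may work. The helper name fill_running_average is made up for this example (it is not a pandas function), and it assumes numeric columns and leaves a NaN untouched when it has no known value on one side.

```python
import numpy as np
import pandas as pd


def fill_running_average(s: pd.Series) -> pd.Series:
    """Hypothetical helper: fill each NaN with the mean of the value just
    before it (original or already filled) and the next known value."""
    # For every position, the next non-NaN original value at or after it.
    next_valid = s.bfill().to_numpy(dtype=float)
    out = s.to_numpy(dtype=float, copy=True)
    for i in range(1, len(out)):
        has_prev = not np.isnan(out[i - 1])
        has_next = not np.isnan(next_valid[i])
        if np.isnan(out[i]) and has_prev and has_next:
            out[i] = (out[i - 1] + next_valid[i]) / 2
    return pd.Series(out, index=s.index, name=s.name)


df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Weight": [80.0, 60.0, np.nan, np.nan, 82.0],
    "Height": [180.0, 170.0, np.nan, np.nan, 185.0],
})
cols = ["Weight", "Height"]

# Evenly spaced fill: Weight becomes about 67.33 and 74.67 for IDs 3 and 4.
linear = df[cols].interpolate()

# Running average: Weight becomes 71.0 and 76.5, Height 177.5 and 181.25.
running = df[cols].apply(fill_running_average)

print(linear)
print(running)
```

The loop is plain Python, which should be fine for small frames; for very large ones you would probably want a vectorized or compiled version.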