How can you use linear interpolation to impute missing time-series data?


Consider the pandas time-series:

0        NaN
1       72.0
2       63.0
3       30.0
4       26.0
5        NaN
6        NaN
7       35.0
8        NaN
9       37.0
...

The NaNs mark time points at which the sensor did not record a reading, so it should be possible to fill them in with linear interpolation. For example, entry 8 should be 36.0, and entries 5 and 6 could be 29.0 and 32.0 respectively. Is there a library or function that can do this? Thank you.


Solution

  • Use Series.interpolate(), which defaults to linear interpolation:

    s.interpolate()
    

    Output

    0     NaN
    1    72.0
    2    63.0
    3    30.0
    4    26.0
    5    29.0
    6    32.0
    7    35.0
    8    36.0
    9    37.0
    Name: col2, dtype: float64
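
For reference, here is a minimal self-contained sketch that rebuilds the series from the question and reproduces the output above. The name `col2` is taken from the printed output. The variable names, and the small timestamped series at the end, are assumptions made up for illustration. Note that entry 0 stays NaN by default because there is no earlier value to interpolate from; passing `limit_direction="both"` fills it with the nearest valid value instead.

```python
import numpy as np
import pandas as pd

# The series from the question (first ten entries only).
s = pd.Series(
    [np.nan, 72.0, 63.0, 30.0, 26.0, np.nan, np.nan, 35.0, np.nan, 37.0],
    name="col2",
)

# Default method is "linear": values are treated as equally spaced.
filled = s.interpolate()
print(filled)

# The leading NaN is left alone by default. With limit_direction="both"
# it is filled with the first valid value (72.0), not extrapolated.
filled_both = s.interpolate(limit_direction="both")
print(filled_both)

# Hypothetical example: if the readings carry irregular timestamps,
# method="time" weights the interpolation by the actual time gaps.
# These timestamps and values are invented for illustration.
ts = pd.Series(
    [10.0, np.nan, 40.0],
    index=pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 00:10", "2024-01-01 00:40"]
    ),
)
filled_time = ts.interpolate(method="time")
print(filled_time)
```

With evenly spaced readings, as in the question, the default linear method is enough; `method="time"` only matters when the index is a DatetimeIndex with uneven gaps.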