Search code examples
pythonpandasinterpolationmissing-data

Filling missing data by interpolation in Python


I have a pandas dataframe which looks like this:

           Date and Time      Seconds  Pressure (mmHg)  Temperature (C)
0    2021-05-13 13:00:00        0.000          709.719           26.551
1    2021-05-13 14:00:00     3600.001          709.364           25.966
2    2021-05-13 15:00:00     7200.001          708.698           25.331
3    2021-05-13 16:00:00    10800.001          707.689           25.184
4    2021-05-13 17:00:00    14400.001          707.206           25.184

The pressure and temperature data meant to be in 15 minutes intervals but the sensor setting was wrong and collected data ever hour. Assuming linear interpolation, how to expand data timestamp to 15-minutes intervals and fill missing data between hours with liner interpolations? I tried the solution suggested here, but my files are huge and there are many of them . This solution is not very fast.


Solution

  • Use DataFrame.resample with Resampler.first for missing values between hours and then DataFrame.interpolate:

    df['Date and Time'] = pd.to_datetime(df['Date and Time'])
    
    df = (df.resample('15Min', on='Date and Time')[['Pressure (mmHg)','Temperature (C)']]
           .first()
           .interpolate())
    print (df)
    
                         Pressure (mmHg)  Temperature (C)
    Date and Time                                        
    2021-05-13 13:00:00        709.71900         26.55100
    2021-05-13 13:15:00        709.63025         26.40475
    2021-05-13 13:30:00        709.54150         26.25850
    2021-05-13 13:45:00        709.45275         26.11225
    2021-05-13 14:00:00        709.36400         25.96600
    2021-05-13 14:15:00        709.19750         25.80725
    2021-05-13 14:30:00        709.03100         25.64850
    2021-05-13 14:45:00        708.86450         25.48975
    2021-05-13 15:00:00        708.69800         25.33100
    2021-05-13 15:15:00        708.44575         25.29425
    2021-05-13 15:30:00        708.19350         25.25750
    2021-05-13 15:45:00        707.94125         25.22075
    2021-05-13 16:00:00        707.68900         25.18400
    2021-05-13 16:15:00        707.56825         25.18400
    2021-05-13 16:30:00        707.44750         25.18400
    2021-05-13 16:45:00        707.32675         25.18400
    2021-05-13 17:00:00        707.20600         25.18400