Tags: r, dataframe, time-series, interpolation, zoo

Time Series Interpolation


I have two data series (calibration and sample) and am trying to interpolate the calibration data from its monthly frequency to the frequency of the sample data, which varies irregularly from minutes to seconds.

I tried the approach from "Interpolating timeseries"; here's my code:

     require(zoo)
     
     zc <- zoo(calib$MW2, calib$Date)    
     zs <- zoo(sample$MW.2, sample$DateMW.2)

     z <- merge(zc, zs)

     zc <- zoo(calib$MW2, calib$Date)   
     zs <- zoo(sample$MW.2, sample$DateMW.2)  
     
     # "merge" gets data frames only
     zc <- data.frame(zc)
     zs <- data.frame(zs)

     z <- merge(zc, zs)

     z$zc <- na.approx(z$zc, rule=2)

     df <- z[index(zs),]

Note: I converted the zoo objects (zc and zs) to data frames before merging.

The problem is that instead of interpolating, this just repeats the calibration data set. Compare this excerpt of the supposedly interpolated df with the original data below:

     > df
        zc       zs                date     
     1     60.84440 61.40373 2016-06-02 18:15:00
     2     58.85957 61.40373 2016-06-02 18:30:00
     3     57.49543 61.40373 2016-06-02 18:45:00
     4     56.32829 61.40373 2016-06-02 19:00:00
     5     56.84261 61.40373 2016-06-02 19:15:00
     6     57.76762 61.40373 2016-06-02 19:30:00
     7     59.58310 61.40373 2016-06-02 19:45:00
     8     59.95826 61.40373 2016-06-02 20:00:00
     9     60.84440 61.41549 2016-06-02 20:15:00
     10    58.85957 61.41549 2016-06-02 20:30:00
     11    57.49543 61.41549 2016-06-02 20:45:00
     12    56.32829 61.41549 2016-06-02 21:00:00

#Data:

     sample <- structure(list(DateMW.2 = structure(1:15, .Label = c("6/2/2016 18:15:00", 
     "6/2/2016 18:30:00", "6/2/2016 18:45:00", "6/2/2016 19:00:00", 
     "6/2/2016 19:15:00", "6/2/2016 19:30:00", "6/2/2016 19:45:00", 
     "6/2/2016 20:00:00", "6/2/2016 20:15:00", "6/2/2016 20:30:00", 
     "6/2/2016 20:45:00", "6/2/2016 21:00:00", "6/2/2016 21:15:00", 
     "6/2/2016 21:30:00", "6/2/2016 21:45:00"), class = "factor"), 
     MW.2 = c(61.40373, 61.41549, 61.41549, 61.42451, 61.42752, 
     61.42478, 61.43107, 61.42369, 61.40564, 61.41056, 61.40592, 
     61.39416, 61.38432, 61.3753, 61.3753)), .Names = c("DateMW.2", 
     "MW.2"), row.names = c(NA, 15L), class = "data.frame")


     calib <- structure(list(Date = structure(c(4L, 5L, 6L, 7L, 8L, 1L, 2L, 
     3L), .Label = c("10/31/2016 12:00:00", "11/30/2016 12:00:00", 
     "12/31/2016 12:00:00", "5/31/2016 12:00:00", "6/30/2016 12:00:00", 
     "7/31/2016 12:00:00", "8/31/2016 12:00:00", "9/30/2016 12:00:00"
     ), class = "factor"), MW2 = c(60.844402, 58.859566, 57.495434, 
     56.328285, 56.842606, 57.76762, 59.583103, 59.958263)), .Names = c("Date",     
     "MW2"), class = "data.frame", row.names = c(NA, -8L))

Solution

  • If your data set is already stored as date-time, you don't need zoo at all. I simply used the approx function and it gave me exactly what I wanted. You can reproduce this with the data set from the question (its date columns are stored as factors, so convert them to date-time first).

           # Both date columns must already be date-time (e.g. POSIXct), not factor
           ipc <- approx(calib$Date, calib$MW2, xout = sample$DateMW.2,
                         rule = 1, method = "linear", ties = mean)
    

    You can see that the data is being interpolated linearly between the given data points.
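To reproduce this with the question's data, the timestamps first have to become date-time values, because the dput output stores them as factors. Here is a minimal sketch, assuming the timestamps are in UTC (the question does not state a time zone) and rebuilding the 15 sample times with seq() instead of copying every string. The names calib_time, sample_time and result are introduced here for illustration only.

```r
# Calibration data from the question, in chronological order
calib_time <- as.POSIXct(
  c("5/31/2016 12:00:00", "6/30/2016 12:00:00", "7/31/2016 12:00:00",
    "8/31/2016 12:00:00", "9/30/2016 12:00:00", "10/31/2016 12:00:00",
    "11/30/2016 12:00:00", "12/31/2016 12:00:00"),
  format = "%m/%d/%Y %H:%M:%S", tz = "UTC"  # tz is an assumption
)
calib_mw2 <- c(60.844402, 58.859566, 57.495434, 56.328285,
               56.842606, 57.76762, 59.583103, 59.958263)

# Sample timestamps from the question: every 15 minutes from 18:15 to 21:45
sample_time <- seq(as.POSIXct("2016-06-02 18:15:00", tz = "UTC"),
                   by = "15 min", length.out = 15)

# Convert to numeric seconds explicitly so the call does not rely on
# implicit coercion of the date-time class
ipc <- approx(as.numeric(calib_time), calib_mw2,
              xout = as.numeric(sample_time),
              rule = 1, method = "linear", ties = mean)

# One interpolated calibration value per sample timestamp
result <- data.frame(time = sample_time, calib_interp = ipc$y)
print(head(result))
```

Every sample timestamp falls between the 5/31 and 6/30 calibration points, so all interpolated values lie between 60.844402 and 58.859566 and decrease slowly over time. With rule = 1, any xout outside the calibration range would come back as NA.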


    Thanks for your insightful comments.
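As for why the code in the question repeats the calibration values: base merge() on two data frames with no column names in common returns their Cartesian product, so each of the 8 calibration values is paired with every sample row. If you would rather stay with zoo, merging the zoo objects themselves joins on the time index instead. The sketch below is an alternative to the approx call, not the method used in this answer. It assumes UTC timestamps (the question gives no time zone), and the variable names are illustrative.

```r
library(zoo)

calib_time <- as.POSIXct(
  c("5/31/2016 12:00:00", "6/30/2016 12:00:00", "7/31/2016 12:00:00",
    "8/31/2016 12:00:00", "9/30/2016 12:00:00", "10/31/2016 12:00:00",
    "11/30/2016 12:00:00", "12/31/2016 12:00:00"),
  format = "%m/%d/%Y %H:%M:%S", tz = "UTC"  # tz is an assumption
)
calib_mw2 <- c(60.844402, 58.859566, 57.495434, 56.328285,
               56.842606, 57.76762, 59.583103, 59.958263)

sample_time <- seq(as.POSIXct("2016-06-02 18:15:00", tz = "UTC"),
                   by = "15 min", length.out = 15)
sample_mw2 <- c(61.40373, 61.41549, 61.41549, 61.42451, 61.42752,
                61.42478, 61.43107, 61.42369, 61.40564, 61.41056,
                61.40592, 61.39416, 61.38432, 61.3753, 61.3753)

# Data frames with no shared column: merge() returns every combination
cart <- merge(data.frame(zc = calib_mw2), data.frame(zs = sample_mw2))
nrow(cart)  # 8 * 15 rows

# zoo objects: merge() aligns on the time index and fills gaps with NA
zc <- zoo(calib_mw2, calib_time)
zs <- zoo(sample_mw2, sample_time)
z <- merge(zc, zs)

# Fill the NAs in the calibration column by linear interpolation in time,
# then keep only the sample timestamps
zc_filled <- na.approx(z[, "zc"], rule = 2)
zc_at_sample <- zc_filled[index(zs)]
```

Note that rule = 2 extends the nearest calibration value to sample times outside the calibration range, whereas rule = 1 in approx returns NA there.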