Search code examples
rdatabaselinear-interpolation

Interpolate to obtaining certain value in R


I'm working with a data frame of 1501 x 35 values, and the data is like in the following table:

Date 1 3 4 5 6
10/02/20 0.04919382 0.04962555 0.04579872 0.0354689 0.048592
20/05/20 0.04909930 0.04957330 0.0458772 0.04741 0.052167
12/08/20 0.04909930 0.04957330 0.04525272 0.035544 0.045489
18/10/20 0.04915135 0.04957330 0.047822 0.03485484 0.024452

I want to obtain the data for "column 2", by interpolating.

I tried with result<- approx(data$1, data$2, xout = data$2, method = 'linear')

but it didn't work, the results are wrong. I also tried with approxfun but it returns 50 values instead of 1501. How can I do a right linear interpolation?

Thanks a lot.


Solution

  • I believe what you want is this. At the moment there's nothing to interpolate, you need a NA column first. You could append one after the second position.

    (d <- as.data.frame(append(d, list(X2=NA), 2)))
    #         Date         X1 X2         X3         X4         X5       X6
    # 1 2020-02-10 0.04919382 NA 0.04962555 0.04579872 0.03546890 0.048592
    # 2 2020-05-20 0.04909930 NA 0.04957330 0.04587720 0.04741000 0.052167
    # 3 2020-08-12 0.04909930 NA 0.04957330 0.04525272 0.03554400 0.045489
    # 4 2020-10-18 0.04915135 NA 0.04957330 0.04782200 0.03485484 0.024452
    

    Now you want to apply the approx function row-wise, i.e. with MARGIN=1. The logic is, that you feed it with a sequence of length of the values to interpolate, i.e. of the x of each apply iteration which is seq(x) as well as the values containing the NAs which is x itself. From the output you want the y, and, because it is the whole (transposed) matrix, just row [2,].

    d$X2 <- apply(d[-1], MARGIN=1, function(x) approx(seq(x), x, seq(x))$y)[2,]
    d
    #         Date         X1         X2         X3         X4         X5       X6
    # 1 2020-02-10 0.04919382 0.04940968 0.04962555 0.04579872 0.03546890 0.048592
    # 2 2020-05-20 0.04909930 0.04933630 0.04957330 0.04587720 0.04741000 0.052167
    # 3 2020-08-12 0.04909930 0.04933630 0.04957330 0.04525272 0.03554400 0.045489
    # 4 2020-10-18 0.04915135 0.04936232 0.04957330 0.04782200 0.03485484 0.024452
    

    Data:

    d <- structure(list(Date = structure(c(18302, 18402, 18486, 18553), class = "Date"), 
        X1 = c(0.04919382, 0.0490993, 0.0490993, 0.04915135), X3 = c(0.04962555, 
        0.0495733, 0.0495733, 0.0495733), X4 = c(0.04579872, 0.0458772, 
        0.04525272, 0.047822), X5 = c(0.0354689, 0.04741, 0.035544, 
        0.03485484), X6 = c(0.048592, 0.052167, 0.045489, 0.024452
        )), row.names = c(NA, -4L), class = "data.frame")