I'm working with a data frame of 1501 x 35 values, and the data is like in the following table:
Date | 1 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|
10/02/20 | 0.04919382 | 0.04962555 | 0.04579872 | 0.0354689 | 0.048592 |
20/05/20 | 0.04909930 | 0.04957330 | 0.0458772 | 0.04741 | 0.052167 |
12/08/20 | 0.04909930 | 0.04957330 | 0.04525272 | 0.035544 | 0.045489 |
18/10/20 | 0.04915135 | 0.04957330 | 0.047822 | 0.03485484 | 0.024452 |
I want to obtain the data for "column 2", by interpolating.
I tried with result<- approx(data$1, data$2, xout = data$2, method = 'linear')
but it didn't work, the results are wrong. I also tried with approxfun
but it returns 50 values instead of 1501. How can I do a right linear interpolation?
Thanks a lot.
I believe what you want is this. At the moment there's nothing to interpolate, you need a NA
column first. You could append
one after the second position.
(d <- as.data.frame(append(d, list(X2=NA), 2)))
# Date X1 X2 X3 X4 X5 X6
# 1 2020-02-10 0.04919382 NA 0.04962555 0.04579872 0.03546890 0.048592
# 2 2020-05-20 0.04909930 NA 0.04957330 0.04587720 0.04741000 0.052167
# 3 2020-08-12 0.04909930 NA 0.04957330 0.04525272 0.03554400 0.045489
# 4 2020-10-18 0.04915135 NA 0.04957330 0.04782200 0.03485484 0.024452
Now you want to apply
the approx
function row-wise, i.e. with MARGIN=1
. The logic is, that you feed it with a sequence of length of the values to interpolate, i.e. of the x
of each apply
iteration which is seq(x)
as well as the values containing the NA
s which is x
itself. From the output you want the y
, and, because it is the whole (transposed) matrix, just row [2,]
.
d$X2 <- apply(d[-1], MARGIN=1, function(x) approx(seq(x), x, seq(x))$y)[2,]
d
# Date X1 X2 X3 X4 X5 X6
# 1 2020-02-10 0.04919382 0.04940968 0.04962555 0.04579872 0.03546890 0.048592
# 2 2020-05-20 0.04909930 0.04933630 0.04957330 0.04587720 0.04741000 0.052167
# 3 2020-08-12 0.04909930 0.04933630 0.04957330 0.04525272 0.03554400 0.045489
# 4 2020-10-18 0.04915135 0.04936232 0.04957330 0.04782200 0.03485484 0.024452
Data:
d <- structure(list(Date = structure(c(18302, 18402, 18486, 18553), class = "Date"),
X1 = c(0.04919382, 0.0490993, 0.0490993, 0.04915135), X3 = c(0.04962555,
0.0495733, 0.0495733, 0.0495733), X4 = c(0.04579872, 0.0458772,
0.04525272, 0.047822), X5 = c(0.0354689, 0.04741, 0.035544,
0.03485484), X6 = c(0.048592, 0.052167, 0.045489, 0.024452
)), row.names = c(NA, -4L), class = "data.frame")