I have a dataset containing weekly observations from 2020-01-05 to 2022-12-25. It looks like the following:
datetime NO3 NO1 NO2 NO5 NO4 fyllingsgrad
<date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2020-11-29 0.930 0.972 0.978 0.932 0.903 0.972
2 2021-01-03 0.770 0.821 0.884 0.769 0.785 0.821
3 2020-04-26 0.264 0.196 0.483 0.268 0.322 0.196
4 2021-05-09 0.253 0.151 0.453 0.185 0.430 0.151
5 2021-10-03 0.708 0.761 0.550 0.588 0.809 0.761
6 2022-03-13 0.364 0.190 0.311 0.242 0.505 0.190
7 2021-01-17 0.671 0.698 0.814 0.667 0.730 0.698
8 2021-07-11 0.724 0.772 0.667 0.593 0.731 0.772
9 2020-12-27 0.813 0.862 0.909 0.808 0.811 0.862
10 2021-06-20 0.693 0.717 0.659 0.506 0.648 0.717
I need to interpolate the NO2 variable from the weekly time resolution to hourly. Does anyone know how this could be done?
I have tried multiple things, but none of them work. An example is the following code:
hour_dat <- data.frame(
datetime = seq(as.POSIXct("2020-01-05"), as.POSIXct("2022-12-25"), by = "1 hour")
)
new_interp <- hour_dat %>%
mutate(NO2_interp = approx(x = mag_data$datetime, y = mag_data$NO2, xout = datetime, method = "linear")$y)
`approx` only interpolates, it does not extrapolate, so `hour_dat` can only cover the date range of `dat` (which is probably the case with your real data). `datetime` should be in the same `POSIXct` format in both data frames. Just `merge` it to `dat` and use `approx` in an `lapply`.
dat$datetime <- as.POSIXct(dat$datetime)
hour_dat <- data.frame(datetime=seq(as.POSIXct("2020-11-29"), as.POSIXct("2021-06-20"), by="1 hour"))
ndat <- merge(dat, hour_dat, all=TRUE)
ndat[-1] <- lapply(ndat[-1], \(y) approx(ndat$datetime, y, xout=ndat$datetime)$y)
Note that this is linear interpolation.
head(ndat)
# datetime NO3 NO1 NO2 NO5 NO4 fyllingsgrad
# 1 2020-04-26 02:00:00 0.2640000 0.1960000 0.4830000 0.2680000 0.3220000 0.1960000
# 2 2020-11-29 00:00:00 0.9298721 0.9718510 0.9779050 0.9318725 0.9028884 0.9718510
# 3 2020-11-29 01:00:00 0.9300000 0.9720000 0.9780000 0.9320000 0.9030000 0.9720000
# 4 2020-11-29 02:00:00 0.9298259 0.9718363 0.9778973 0.9318155 0.9028631 0.9718363
# 5 2020-11-29 03:00:00 0.9296518 0.9716726 0.9777946 0.9316310 0.9027262 0.9716726
# 6 2020-11-29 04:00:00 0.9294777 0.9715089 0.9776920 0.9314464 0.9025893 0.9715089
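To illustrate why the hourly sequence has to stay inside the range of the observed dates, here is a minimal sketch of how `approx` treats points outside that range. The vectors `x`, `y` and `xout` are made-up toy values, not taken from the data above. By default (`rule = 1`) points outside the range come back as `NA`; with `rule = 2` the value at the nearest end point is carried outwards, which is constant padding rather than true extrapolation.

```r
# Toy data (hypothetical, for illustration only)
x <- c(1, 2, 3)
y <- c(10, 20, 30)
xout <- c(0, 1.5, 4)  # 0 and 4 lie outside the range of x

# Default rule = 1: NA outside the observed range
inside_only <- approx(x, y, xout = xout)$y
inside_only
# [1] NA 15 NA

# rule = 2: repeat the nearest end value outside the range
clamped <- approx(x, y, xout = xout, rule = 2)$y
clamped
# [1] 10 15 30
```

If the hourly grid starts before the first or ends after the last weekly observation, the `NA`s from the default rule are therefore expected.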
We can quickly check the result graphically:
matplot(ndat[-1], type='l', lty=1)
legend('topright', legend=names(ndat)[-1], lty=1, col=1:6, ncol=2, cex=.8, bty='n')
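Since the question only asks for NO2, the same idea can probably be applied to that single column without the merge step. The following is a sketch under a few assumptions: `weekly` and `hourly` are hypothetical names, only three of the dates from the question are used, and `tz = "UTC"` is set explicitly so that daylight saving time does not shift the hourly grid. The date-times are converted with `as.numeric` before being passed to `approx` to make the numeric interpolation explicit.

```r
# Hypothetical subset of the weekly data, sorted by date
weekly <- data.frame(
  datetime = as.POSIXct(c("2020-12-27", "2021-01-03", "2021-01-17"), tz = "UTC"),
  NO2 = c(0.909, 0.884, 0.814)
)

# Hourly grid restricted to the observed range, so nothing is extrapolated
hourly <- data.frame(
  datetime = seq(min(weekly$datetime), max(weekly$datetime), by = "1 hour")
)

# Linear interpolation of NO2 onto the hourly grid
hourly$NO2_interp <- approx(
  x = as.numeric(weekly$datetime),
  y = weekly$NO2,
  xout = as.numeric(hourly$datetime),
  method = "linear"
)$y

head(hourly)
```

Building the grid from `min()` and `max()` of the observed dates keeps every hourly point inside the interpolation range, so no `NA`s should appear.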
Data:
dat <- structure(list(datetime = c("2020-11-29", "2021-01-03", "2020-04-26",
"2021-05-09", "2021-10-03", "2022-03-13", "2021-01-17", "2021-07-11",
"2020-12-27", "2021-06-20"), NO3 = c(0.93, 0.77, 0.264, 0.253,
0.708, 0.364, 0.671, 0.724, 0.813, 0.693), NO1 = c(0.972, 0.821,
0.196, 0.151, 0.761, 0.19, 0.698, 0.772, 0.862, 0.717), NO2 = c(0.978,
0.884, 0.483, 0.453, 0.55, 0.311, 0.814, 0.667, 0.909, 0.659),
NO5 = c(0.932, 0.769, 0.268, 0.185, 0.588, 0.242, 0.667,
0.593, 0.808, 0.506), NO4 = c(0.903, 0.785, 0.322, 0.43,
0.809, 0.505, 0.73, 0.731, 0.811, 0.648), fyllingsgrad = c(0.972,
0.821, 0.196, 0.151, 0.761, 0.19, 0.698, 0.772, 0.862, 0.717
)), class = "data.frame", row.names = c("1", "2", "3", "4",
"5", "6", "7", "8", "9", "10"))