My data set consists of readings taken at irregular time intervals
df1 <- read.table(text = "DT odczyt.1 odczyt.2
'2023-08-16 00:00:00' 482 1.5
'2023-08-16 23:55:00' 483 4.3
'2023-08-17 05:00:00' 484 1.6
'2023-08-17 23:59:45' 485 5.6
'2023-08-18 00:50:00' 486 1.5
'2023-08-18 23:50:45' 487 6.8
'2023-08-19 00:03:00' 586 1.5
'2023-08-19 05:09:45' 578 7.9
'2023-08-19 17:01:00' 521 11.5
'2023-08-19 18:01:15' 365 8.5
'2023-08-19 22:59:30' 490 1.8
'2023-08-19 23:59:45' 456 6.6
'2023-08-20 00:48:00' 478 1.5
'2023-08-20 03:03:00' 461 1.54
'2023-08-20 05:04:15' 397 1.6
'2023-08-20 19:07:00' 492 6.6
'2023-08-20 23:59:45' 493 3.8
'2023-08-21 01:00:00' 471 9.2
'2023-08-21 07:00:30' 493 8.5
'2023-08-21 13:21:45' 495 5.5", header = TRUE) %>%
mutate (DT = as.POSIXct(DT))
He would like to estimate the values at 00:00 each day. I was thinking about using na.approx
Expected results with one interpolated value:
df1 <- read.table(text = "DT odczyt.1 odczyt.2
'2023-08-16 00:00:00' 482 1.5
'2023-08-16 23:55:00' 483 4.3
'2023-08-17 00:00:00' 483.016 4.256
'2023-08-17 05:00:00' 484 1.6
'2023-08-17 23:59:45' 485 5.6
'2023-08-18 00:00:00'
'2023-08-18 00:50:00' 486 1.5
'2023-08-18 23:50:45' 487 6.8
'2023-08-19 00:00:00'
'2023-08-19 00:03:00' 586 1.5
'2023-08-19 05:09:45' 578 7.9
'2023-08-19 17:01:00' 521 11.5
'2023-08-19 18:01:15' 365 8.5
'2023-08-19 22:59:30' 490 1.8
'2023-08-19 23:59:45' 456 6.6
'2023-08-20 00:00:00'
'2023-08-20 00:48:00' 478 1.5
'2023-08-20 03:03:00' 461 1.54
'2023-08-20 05:04:15' 397 1.6
'2023-08-20 19:07:00' 492 6.6
'2023-08-20 23:59:45' 493 3.8
'2023-08-21 00:00:00'
'2023-08-21 01:00:00' 471 9.2
'2023-08-21 07:00:30' 493 8.5
'2023-08-21 13:21:45' 495 5.5", header = TRUE) %>%
mutate (DT = as.POSIXct(DT))
Using na.approx seems not to be complicated if the dates are supplemented with 00:00
A possible approach:
(fill_gaps
ensures regular intervals before interpolating.)
library(tidyverse)
library(tsibble)
library(zoo)
df1 <- read.table(text = "DT odczyt.1 odczyt.2
'2023-08-16 00:00:00' 482 1.5
'2023-08-16 23:55:00' 483 4.3
'2023-08-17 05:00:00' 484 1.6
'2023-08-17 23:59:45' 485 5.6
'2023-08-18 00:50:00' 486 1.5
'2023-08-18 23:50:45' 487 6.8
'2023-08-19 00:03:00' 586 1.5
'2023-08-19 05:09:45' 578 7.9
'2023-08-19 17:01:00' 521 11.5
'2023-08-19 18:01:15' 365 8.5
'2023-08-19 22:59:30' 490 1.8
'2023-08-19 23:59:45' 456 6.6
'2023-08-20 00:48:00' 478 1.5
'2023-08-20 03:03:00' 461 1.54
'2023-08-20 05:04:15' 397 1.6
'2023-08-20 19:07:00' 492 6.6
'2023-08-20 23:59:45' 493 3.8
'2023-08-21 01:00:00' 471 9.2
'2023-08-21 07:00:30' 493 8.5
'2023-08-21 13:21:45' 495 5.5", header = TRUE) %>%
mutate (DT = as.POSIXct(DT))
df1 |>
as_tsibble(index = DT) |>
fill_gaps() |>
mutate(across(-DT, na.approx)) |>
filter(hour(DT) == 0, minute(DT) == 0, second(DT) == 0)
#> # A tsibble: 6 x 3 [15s] <?>
#> DT odczyt.1 odczyt.2
#> <dttm> <dbl> <dbl>
#> 1 2023-08-16 00:00:00 482 1.5
#> 2 2023-08-17 00:00:00 483. 4.26
#> 3 2023-08-18 00:00:00 485. 5.58
#> 4 2023-08-19 00:00:00 562. 2.80
#> 5 2023-08-20 00:00:00 456. 6.57
#> 6 2023-08-21 00:00:00 493. 3.82
Created on 2024-03-16 with reprex v2.1.0