How to create a time series with tibbletime with parameterized dates?


I want to create a time series with tibbletime for specific dates. I have:

Data_Start<-"2015-09-07 01:55:00 UTC"
Data_End<-"2015-09-10 01:59:00 UTC"

and I want to create a time series with one-minute samples, as with:

create_series(2015-09-07 + 01:55:00 ~ 2015-09-10 + 01:59:00,1~M)

The parameters should be a time_formula, described on page 17 here: https://cran.r-project.org/web/packages/tibbletime/tibbletime.pdf

This works, but I cannot pass the dates as variables, like this:

create_series(Data_Start~Data_End,1~M)

I have already tried several ways of converting the strings, but nothing has worked so far :(


Solution

  • Author of tibbletime here. An issue on GitHub was raised about this recently. The solution is to use rlang::new_formula() to pre-construct the formula. We also need a special helper function that can handle adding the + in the formula if using POSIXct dates.

    Here is the helper:

    # Time formula creator
    # Can pass character, Date, POSIXct
    create_time_formula <- function(lhs, rhs) {
    
      if(!inherits(lhs, c("character", "Date", "POSIXct"))) {
        stop("LHS must be a character or date")
      }
      if(!inherits(rhs, c("character", "Date", "POSIXct"))) {
        stop("RHS must be a character or date")
      }
    
      if(inherits(lhs, "Date")) {
        lhs <- as.character(lhs)
      } else if (inherits(lhs, "POSIXct")) {
        lhs <- gsub(" ", " + ", lhs)
      }
    
      if(inherits(rhs, "Date")) {
        rhs <- as.character(rhs)
      } else if (inherits(rhs, "POSIXct")) {
        rhs <- gsub(" ", " + ", rhs)
      }
    
      rlang::new_formula(lhs, rhs)
    }
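
    The POSIXct branch of the helper relies on gsub() coercing its input to character before substituting. A minimal base R sketch of just that step, separate from tibbletime (the names x and plus_form are only for illustration, and the time zone is set explicitly so the result does not depend on the local machine):

    ```r
    # A POSIXct value with an explicit time zone
    x <- as.POSIXct("2015-09-07 01:55:00", tz = "UTC")

    # gsub() calls as.character() on non-character input, so the single
    # space between the date and the time is replaced by " + "
    plus_form <- gsub(" ", " + ", x)
    print(plus_form)
    #> [1] "2015-09-07 + 01:55:00"
    ```

    The result is the same date + time text that the time formula in the question spells out by hand.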
    

    Use the helper function with POSIXct versions of your start and end dates:

    Data_Start <- as.POSIXct("2015-09-07 01:55:00")
    Data_End   <- as.POSIXct("2015-09-10 01:59:00")
    
    time_formula <- create_time_formula(Data_Start, Data_End)
    
    create_series(time_formula, 1~M, tz = "UTC")
    

    Produces:

    # A time tibble: 4,325 x 1
    # Index: date
                      date
                    <dttm>
     1 2015-09-07 01:55:00
     2 2015-09-07 01:56:00
     3 2015-09-07 01:57:00
     4 2015-09-07 01:58:00
     5 2015-09-07 01:59:00
     6 2015-09-07 02:00:00
     7 2015-09-07 02:01:00
     8 2015-09-07 02:02:00
     9 2015-09-07 02:03:00
    10 2015-09-07 02:04:00
    # ... with 4,315 more rows
    

    In a future release of tibbletime I will likely include a more robust version of the create_time_formula() helper function for this case.


    Update: tibbletime 0.1.0 has been released, and a more robust implementation allows variables to be used directly in the formula. Additionally, each side of the formula must now be a character or an object of the same class as the index (for example, 2013 ~ 2014 should now be written as "2013" ~ "2014").

    library(tibbletime)
    
    Data_Start <- as.POSIXct("2015-09-07 01:55:00")
    Data_End   <- as.POSIXct("2015-09-10 01:59:00")
    
    create_series(Data_Start ~ Data_End, "1 min")
    #> # A time tibble: 4,325 x 1
    #> # Index: date
    #>    date               
    #>    <dttm>             
    #>  1 2015-09-07 01:55:00
    #>  2 2015-09-07 01:56:00
    #>  3 2015-09-07 01:57:00
    #>  4 2015-09-07 01:58:00
    #>  5 2015-09-07 01:59:00
    #>  6 2015-09-07 02:00:00
    #>  7 2015-09-07 02:01:00
    #>  8 2015-09-07 02:02:00
    #>  9 2015-09-07 02:03:00
    #> 10 2015-09-07 02:04:00
    #> # ... with 4,315 more rows
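
    As a cross-check that does not depend on tibbletime, the original strings from the question can be parsed and expanded with base R alone. This is only a sketch: it assumes the timestamps are meant as UTC, and the names fmt, start_time, end_time and minutes are made up for illustration. The explicit format stops before the trailing " UTC" text, which strptime-style parsing ignores.

    ```r
    Data_Start <- "2015-09-07 01:55:00 UTC"
    Data_End   <- "2015-09-10 01:59:00 UTC"

    # Parse only the date and time part; the time zone is given separately
    fmt <- "%Y-%m-%d %H:%M:%S"
    start_time <- as.POSIXct(Data_Start, format = fmt, tz = "UTC")
    end_time   <- as.POSIXct(Data_End, format = fmt, tz = "UTC")

    # One element per minute, both endpoints included
    minutes <- seq(start_time, end_time, by = "1 min")
    length(minutes)
    #> [1] 4325
    ```

    The length matches the 4,325 rows reported by create_series() above (3 days and 4 minutes, counting both endpoints).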