
R best practices: for a CRAN package, should you optimize or keep the base methods?


I'm working on my CRAN package and trying to optimize its speed.

The main problem I've found is that the "base" (actually stats) methods for time series are quite slow, which is especially wasteful when both series share the same tsp.

set.seed(1)
a <- ts(rnorm(480),start=2010,freq=12)
b <- ts(rnorm(480),start=2010,freq=12)
library(microbenchmark)

ts_fastop <- function(x,y,FUN) {
  FUN <- match.fun(FUN)
  tspx <- tsp(x)
  if (any(abs(tspx - tsp(y)) > getOption("ts.eps"))) stop("This method is only made for similar tsp", call. = FALSE)
  ts(FUN(as.numeric(x),as.numeric(y)),start=tspx[1L],frequency = tspx[3L])
}
identical(ts_fastop(a,b,`+`),a+b)
# [1] TRUE
microbenchmark(ts_fastop(a,b,`+`),a+b,times=1000L)
# Unit: microseconds
#                  expr   min    lq     mean median    uq    max neval
#  ts_fastop(a, b, `+`)  13.7  15.3  24.1260   17.4  18.9 6666.4  1000
#                 a + b 364.5 372.5 385.7744  375.6 380.4 7218.4  1000

I think that about 380 microseconds for a simple + on a couple of series is far too much.

However, since I am bypassing these methods, I wonder what the best practice is:

  • if package authors bypass core functions, I guess it makes it harder for the R core team to manage upgrades
  • the source is more readable when written as a+b than as ts_fastop(a,b,`+`)

So is there any advice on this?

Thanks


Solution

  • Define a subclass of ts, so that the fast and the standard methods can coexist. The constructor below should work for building fast_ts objects from ts objects, plain vectors, zoo and xts objects, and anything else for which an as.ts method exists.

    as.fast_ts <- function(x, ...) UseMethod("as.fast_ts")
    as.fast_ts.fast_ts <- identity
    as.fast_ts.default <- function(x, ...) structure(as.ts(x, ...), 
      class = c("fast_ts", "ts"))
    
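    # Fast path: assumes a binary operator and that e1 and e2 are both
    # series with the same tsp; c() drops the ts attributes before the call.
    Ops.fast_ts <- function(e1, e2) {
      result <- match.fun(.Generic)(c(e1), c(e2))
      structure(result, tsp = tsp(e1), class = c("fast_ts", "ts"))
    }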
    
    # test
    
    set.seed(1)
    a <- ts(rnorm(480),start=2010,freq=12)
    b <- ts(rnorm(480),start=2010,freq=12)
    af <- as.fast_ts(a)
    bf <- as.fast_ts(b)
    
    library(microbenchmark)
    microbenchmark(a+b, af+bf)
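
The Ops.fast_ts method above takes for granted that the operator is binary and that both operands are series with the same tsp. As a hedged sketch (not part of the original answer), one possible way to cover the other cases is to take the fast path only when both operands are univariate series with matching tsp, and otherwise to drop the fast_ts class and let the regular stats methods do the work. The helper name strip_fast is hypothetical, and the tolerance check simply mirrors the one used in ts_fastop in the question.

```r
# Sketch only: a more defensive Ops method for the fast_ts subclass.
# strip_fast() is a hypothetical helper, not part of any package.
as.fast_ts <- function(x, ...) UseMethod("as.fast_ts")
as.fast_ts.fast_ts <- function(x, ...) x
as.fast_ts.default <- function(x, ...) {
  structure(as.ts(x, ...), class = c("fast_ts", "ts"))
}

# Remove the "fast_ts" class so that the regular ts methods are dispatched.
strip_fast <- function(x) {
  if (inherits(x, "fast_ts")) class(x) <- setdiff(class(x), "fast_ts")
  x
}

Ops.fast_ts <- function(e1, e2) {
  FUN <- match.fun(.Generic)
  # Fast path only for a binary operator on two univariate series
  # whose tsp attributes agree within the ts.eps tolerance.
  fast <- !missing(e2) &&
    inherits(e1, "ts") && inherits(e2, "ts") &&
    is.null(dim(e1)) && is.null(dim(e2)) &&
    all(abs(tsp(e1) - tsp(e2)) < getOption("ts.eps"))
  if (fast) {
    res <- FUN(as.vector(e1), as.vector(e2))
    return(structure(res, tsp = tsp(e1), class = c("fast_ts", "ts")))
  }
  # Fallback: unary operators, scalars, or differing tsp go through
  # the standard ts methods, and the result is converted back.
  res <- if (missing(e2)) {
    FUN(strip_fast(e1))
  } else {
    FUN(strip_fast(e1), strip_fast(e2))
  }
  as.fast_ts(res)
}

# Usage
set.seed(1)
a <- ts(rnorm(480), start = 2010, frequency = 12)
b <- ts(rnorm(480), start = 2010, frequency = 12)
af <- as.fast_ts(a)
bf <- as.fast_ts(b)
cf <- as.fast_ts(window(b, start = 2011))

s1 <- af + bf   # fast path: same tsp
s2 <- af + cf   # fallback: series are aligned by the stats method
s3 <- -af       # fallback: unary operator
s4 <- 2 * af    # fallback: scalar operand
```

Each extra check adds a little overhead to the fast path, so whether this variant is worth it depends on how often the operands fall outside the same-tsp case.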