R fastest bivariate regression slope coefficient

I have tried a number of linear regression functions, and though the standard ones are all good (e.g., is really nicely fast), the fastLm in takes the cake. It is truly amazingly fast. Thanks Doug, Dirk, Romain, and Yixuan.

Alas, there is a note in its documentation that the special case of a bivariate regression could be done even faster. If I just need the slope, should I use the native cov(x,y)/var(x) in R, or should I write this in Rcpp, or ...?


  • Seems like wins for this case, although it could depend on the size of your data set. I don't know if you could do even better with something Rcpp-ish: if you want to do lots of really small regressions, various kinds of function-calling overhead are going to get important. (fastR has a fast covariance calculator, but it says it is competitive in/intended for the high-dimensional case.)

    simfun <- function(n = 100) {
       data.frame(y = rnorm(n), x = rnorm(n))
    }
    dd <- simfun()
    mylm <- function(x, y) { v <- var(cbind(x,y)); v[2,1]/v[1,1] }
    mylm2  <- function(x, y) { cov(x,y) / var(x) }
    with(dd, bench::mark(
       …(cbind(1, x), y)$coefficients[2],
       …(cbind(1, x), y)$coefficients[2],
       fastLmPure(cbind(1, x), y)$coefficients[2],
       mylm(x, y),
       mylm2(x, y),
       check = FALSE
    ))
      expression     min  median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
      <bch:expr> <bch:t> <bch:t>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm>
    1… 24.68µs 26.45µs    36555.    7.34KB     14.6  9996     4    273.4ms
    2…  5.86µs  6.67µs   144591.    4.88KB     14.5  9999     1     69.2ms
    3 fastLmPur… 12.85µs 13.99µs    70384.    3.27KB     14.1  9998     2      142ms
    4 mylm(x, y) 10.05µs 11.16µs    86832.    1.61KB     26.1  9997     3    115.1ms
    5 mylm2(x, … 22.68µs 24.32µs    40539.        0B     20.3  9995     5    246.6ms
    # ℹ 4 more variables: result <list>, memory <list>, time <list>, gc <list>
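
On the "something Rcpp-ish" idea: here is a minimal sketch of what the compiled core might look like, written as plain C++ so it compiles without R. The name `bivariate_slope` is hypothetical and not part of any package. Exposing it to R through Rcpp (for example with `Rcpp::NumericVector` arguments and a `// [[Rcpp::export]]` tag) is assumed and left out. It computes cov(x, y) / var(x) in a single pass with a Welford-style update and never builds the `cbind(1, x)` design matrix. Whether it actually beats the timings above is untested.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Slope of the least-squares line y = a + b*x, i.e. cov(x, y) / var(x).
// Hypothetical helper for illustration only.
// Returns NaN if the inputs differ in length, contain fewer than two
// points, or x has zero variance.
double bivariate_slope(const std::vector<double>& x,
                       const std::vector<double>& y) {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    if (x.size() != y.size() || x.size() < 2) return nan;

    double mean_x = 0.0, mean_y = 0.0;
    double sxx = 0.0;  // running sum of squared deviations of x
    double sxy = 0.0;  // running sum of cross-deviations of x and y

    for (std::size_t i = 0; i < x.size(); ++i) {
        const double n = static_cast<double>(i + 1);
        const double dx = x[i] - mean_x;  // deviation from the previous mean
        mean_x += dx / n;
        mean_y += (y[i] - mean_y) / n;
        sxx += dx * (x[i] - mean_x);      // previous deviation * updated deviation
        sxy += dx * (y[i] - mean_y);      // uses the already-updated mean_y
    }

    // The (n - 1) denominators of cov and var cancel in the ratio.
    return sxx > 0.0 ? sxy / sxx : nan;
}
```

For the same inputs, the result should agree with `mylm2(x, y)` up to floating-point rounding. For lots of really small regressions, the call overhead of crossing from R into compiled code may still dominate, so it is worth benchmarking on your own data sizes.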