I refer to the excellent post at
https://purrple.cat/blog/2018/03/02/multiple-lags-with-tidy-evaluation/
What I want to do is to create a function capable, à la dplyr, to generate new columns which are a lagged version of existing columns in a data frame. For instance, you can do this manually as
library(dplyr)
library(rlang)
d2 <- tibble(x1 =1:10, x2=10:19, x3=50:59)
d3 <- d2%>%mutate(x1lag1=lag(x1, 1), x1lag2=lag(x1,2))
but this becomes quickly tedious when you need to take several lags of different columns. One solution in the link above is the following
lags <- function(var, n=10){
var <- enquo(var)
indices <- seq_len(n)
map( indices, ~quo(lag(!!var, !!.x)) ) %>%
set_names(sprintf("lag_%s_%02d", quo_text(var), indices))
}
d4 <- d2 %>%
mutate( !!!lags(x1, 3), !!!lags(x2,3) )
Does anybody know how this could be made more general? I mean that I would like to take a fixed number of lags of a list of columns (x1 and x2, for instance), just by passing the list of columns and without repeating the commands for x1 and x2.
Any suggestion is appreciated.
I think the idea would be to use ...
instead of var
, which would follow closest the spirit of your function.
To do this, it required changing enquo()
to enquos()
, and I use here a crossing
and map2
, but there's probably a more elegant way to do so...
library(tidyverse)
library(rlang)
#>
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#>
#> %@%, as_function, flatten, flatten_chr, flatten_dbl,
#> flatten_int, flatten_lgl, flatten_raw, invoke, list_along,
#> modify, prepend, splice
d <- data_frame(x = seq_len(100),
y = rnorm(100))
#> Warning: `data_frame()` is deprecated, use `tibble()`.
#> This warning is displayed once per session.
multijetlag <- function(data, ..., n=10){
variable <- enquos(...)
indices <- seq_len(n)
combos <- crossing(indices, var =as.list(variable))
quosures <- map2(combos$indices, combos$var,
~quo(lag(!!.y, !!.x)) ) %>%
set_names(paste("lag", combos$indices, map_chr(combos$var, quo_text), sep = "_"))
mutate( data, !!!quosures )
}
multijetlag(d, x, y, n=3)
#> # A tibble: 100 x 8
#> x y lag_1_x lag_1_y lag_2_x lag_2_y lag_3_x lag_3_y
#> <int> <dbl> <int> <dbl> <int> <dbl> <int> <dbl>
#> 1 1 0.213 NA NA NA NA NA NA
#> 2 2 0.277 1 0.213 NA NA NA NA
#> 3 3 -0.517 2 0.277 1 0.213 NA NA
#> 4 4 -0.671 3 -0.517 2 0.277 1 0.213
#> 5 5 -1.12 4 -0.671 3 -0.517 2 0.277
#> 6 6 -0.296 5 -1.12 4 -0.671 3 -0.517
#> 7 7 -1.18 6 -0.296 5 -1.12 4 -0.671
#> 8 8 0.0582 7 -1.18 6 -0.296 5 -1.12
#> 9 9 -0.455 8 0.0582 7 -1.18 6 -0.296
#> 10 10 -0.969 9 -0.455 8 0.0582 7 -1.18
#> # … with 90 more rows
Created on 2019-04-23 by the reprex package (v0.2.1)