
How to evaluate a constructed string with non-standard evaluation using dplyr?


I have read several guides on programming with dplyr, but I am still confused about how to evaluate constructed/concatenated strings with non-standard evaluation (NSE). I realize there are better ways to solve this example than using NSE, but I want to learn how to do it that way.

library(dplyr)
t <- tibble(x_01 = c(1, 2, 3), x_02 = c(4, 5, 6))
i <- 1

This is my desired outcome, but I want the variable names in mutate() to be constructed:

t %>% mutate(d_01 = x_01 * 2)
#>   A tibble: 3 x 3
#>   x_01  x_02  d_01
#>   <dbl> <dbl> <dbl>
#> 1  1.00  4.00  2.00
#> 2  2.00  5.00  4.00
#> 3  3.00  6.00  6.00

This is my first attempt, trying to use strings:

new <- sprintf("d_%02d", i)
var <- sprintf("x_%02d", i)
t %>% mutate(new = var * 2)
#> Error in mutate_impl(.data, dots) : 
#> Evaluation error: non-numeric argument to binary operator.

This is my second attempt, trying to use quosures:

new <- rlang::quo(sprintf("d_%02d", i))
var <- rlang::quo(sprintf("x_%02d", i))
t %>% mutate(!!new = !!var * 2)
#> Error: unexpected '=' in "t %>% mutate(!!new ="

This is my third attempt, trying to use quosures and the := operator:

new <- rlang::quo(sprintf("d_%02d", i))
var <- rlang::quo(sprintf("x_%02d", i))
t %>% mutate(!!new := !!var * 2)
#> Error in var * 2 : non-numeric argument to binary operator

Solution
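
A short sketch of why the quosure attempts in the question fail may help before the fix. It assumes only rlang is attached; the names q, val, s, df and col are illustrative and do not come from the original post. quo() captures the sprintf() call as an expression, so unquoting it inside mutate() still evaluates to the string "x_01", whereas sym() builds a symbol that resolves to the column of that name.

```r
library(rlang)

i <- 1

# quo() captures the unevaluated sprintf() call together with its environment.
q <- quo(sprintf("x_%02d", i))
is_quosure(q)                     # TRUE

# Evaluating the quosure runs sprintf() and returns a plain string,
# which is why `!!var * 2` ends up as "x_01" * 2 and errors.
val <- eval_tidy(q)
val                               # "x_01"

# sym() converts the constructed string into a symbol (a bare name).
s <- sym(sprintf("x_%02d", i))
is_symbol(s)                      # TRUE

# Evaluated against the data, the symbol looks up the column itself.
df <- data.frame(x_01 = c(1, 2, 3), x_02 = c(4, 5, 6))
col <- eval_tidy(s, data = df)
col                               # 1 2 3
```

The second attempt fails even earlier, at parse time: R does not accept `!!new` on the left of `=` in a function call, while `:=` is parsed as an ordinary operator, which is why it is needed for constructed names.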

  • Use sym() to turn the constructed strings into symbols, unquote them with !!, and assign with := like this:

    library(dplyr)
    library(rlang)
    
    t <- tibble( x_01 = c(1, 2, 3), x_02 = c(4, 5, 6))
    i <- 1
    
    new <- sym(sprintf("d_%02d", i))
    var <- sym(sprintf("x_%02d", i))
    t %>% mutate(!!new := (!!var) * 2)
    

    giving:

    # A tibble: 3 x 3
       x_01  x_02  d_01
      <dbl> <dbl> <dbl>
    1     1     4     2
    2     2     5     4
    3     3     6     6
    

    Also note that this is trivial in base R:

    tdf <- data.frame( x_01 = c(1, 2, 3), x_02 = c(4, 5, 6))
    i <- 1
    
    new <- sprintf("d_%02d", i)
    var <- sprintf("x_%02d", i)
    tdf[[new]] <- 2 * tdf[[var]]
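
If the strings should stay strings, a possible alternative within dplyr is the .data pronoun, which subsets the current data by name in the same way as the base R `[[` above. This is a minimal sketch rather than part of the original answer; it assumes a dplyr version that provides .data (0.7 or later), and the result name res is illustrative.

```r
library(dplyr)

t <- tibble(x_01 = c(1, 2, 3), x_02 = c(4, 5, 6))
i <- 1

# Keep both names as ordinary strings; no sym() or quo() needed.
new <- sprintf("d_%02d", i)
var <- sprintf("x_%02d", i)

# `!!new :=` sets the output column name from the string;
# `.data[[var]]` looks the input column up by its string name.
res <- t %>% mutate(!!new := .data[[var]] * 2)
res
```

Recent rlang releases also appear to accept a glue-style name on the left-hand side, as in `mutate("{new}" := .data[[var]] * 2)`, but that depends on the installed version.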