r function tidyverse rlang quosure

Using `rlang::exec` with functions that use `rlang::ensym`


I am trying to write a custom function that is fairly complicated, so for the sake of simplicity I have created toy examples.

Let's say I want to write a function that:

  1. automatically decides the appropriate function to run: for example, a t-test or an anova.
  2. accepts both "quoted" and unquoted arguments

So I write a function to run a t-test (works as expected):

set.seed(123)
library(rlang)
library(tidyverse)

# t-test function
fun_t <- function(data, x, y) {
  # make sure both quoted and unquoted arguments work
  x <- rlang::ensym(x)
  y <- rlang::ensym(y)

  # t-test
  broom::tidy(stats::t.test(
    formula = rlang::new_formula({{ y }}, {{ x }}),
    data = data
  ))
}

# works fine
fun_t(mtcars, am, wt)
#> # A tibble: 1 x 10
#>   estimate estimate1 estimate2 statistic p.value parameter conf.low
#>      <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>
#> 1     1.36      3.77      2.41      5.49 6.27e-6      29.2    0.853
#> # ... with 3 more variables: conf.high <dbl>, method <chr>,
#> #   alternative <chr>

fun_t(mtcars, "am", "wt")
#> # A tibble: 1 x 10
#>   estimate estimate1 estimate2 statistic p.value parameter conf.low
#>      <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>
#> 1     1.36      3.77      2.41      5.49 6.27e-6      29.2    0.853
#> # ... with 3 more variables: conf.high <dbl>, method <chr>,
#> #   alternative <chr>

Then I write a function to run an anova (works as expected):

# anova function
fun_anova <- function(data, x, y) {
  # make sure both quoted and unquoted arguments work
  x <- rlang::ensym(x)
  y <- rlang::ensym(y)

  # anova
  broom::tidy(stats::aov(
    formula = rlang::new_formula({{ y }}, {{ x }}),
    data = data
  ))
}

# works fine
fun_anova(mtcars, cyl, wt)
#> # A tibble: 2 x 6
#>   term         df sumsq meansq statistic      p.value
#>   <chr>     <dbl> <dbl>  <dbl>     <dbl>        <dbl>
#> 1 cyl           1  18.2 18.2        47.4  0.000000122
#> 2 Residuals    30  11.5  0.384      NA   NA

fun_anova(mtcars, "cyl", "wt")
#> # A tibble: 2 x 6
#>   term         df sumsq meansq statistic      p.value
#>   <chr>     <dbl> <dbl>  <dbl>     <dbl>        <dbl>
#> 1 cyl           1  18.2 18.2        47.4  0.000000122
#> 2 Residuals    30  11.5  0.384      NA   NA

Then I write a meta-function to choose the appropriate function from above:

fun_meta <- function(data, x, y) {
  # make sure both quoted and unquoted arguments work
  x <- rlang::ensym(x)
  y <- rlang::ensym(y)

  # which test to run?
  if (nlevels(data %>% dplyr::pull({{ x }})) == 2L) {
    .f <- fun_t
  } else {
    .f <- fun_anova
  }

  # executing the appropriate function
  rlang::exec(
    .fn = .f,
    data = data,
    x = x,
    y = y
  )
}

# using the meta-function
fun_meta(mtcars, am, wt)
#> Only strings can be converted to symbols

fun_meta(mtcars, "am", "wt")
#> Only strings can be converted to symbols

But this doesn't seem to work. Any ideas on what I am doing wrong here and how to get this to work?


Solution

  • The problem seems to stem from fun_meta() capturing x and y with rlang::ensym() and then passing the resulting symbols on to fun_t() or fun_anova() via rlang::exec(). Inside those functions ensym() runs a second time, and what it receives at that point is no longer a bare name or a string.

    ensym() accepts only strings or symbols, which is what triggers the error message. Given this, converting x and y to strings before passing them on should help.
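
A minimal sketch of that constraint, using a hypothetical helper capture_name() that is not part of the original code: ensym() accepts a bare name or a string, but signals an error for any other expression, such as a call. The exact wording of the error may differ between rlang versions, so the sketch only checks that an error occurs.

```r
library(rlang)

# capture_name() is a hypothetical helper for illustration only
capture_name <- function(x) {
  # ensym() captures the argument as a symbol; as_string() converts it to text
  rlang::as_string(rlang::ensym(x))
}

from_symbol <- capture_name(am)    # bare symbol
from_string <- capture_name("am")  # string

# A call such as quote(am) is neither a symbol nor a string, so ensym() errors
from_call <- tryCatch(
  capture_name(quote(am)),
  error = function(e) e
)

print(from_symbol)
print(from_string)
print(inherits(from_call, "error"))
```

Both accepted forms end up as the same name, which is why handing the inner functions a plain string is a safe way to forward the column.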

    So the meta function could be:

    fun_meta <- function(data, x, y) {
      # make sure both quoted and unquoted arguments work
      x <- rlang::ensym(x)
      y <- rlang::ensym(y)

      # which test to run?
      if (dplyr::n_distinct(data %>% dplyr::pull({{ x }})) == 2L) {
        .f <- fun_t
      } else {
        .f <- fun_anova
      }

      # executing the appropriate function
      rlang::exec(
        .fn = .f,
        data = data,
        x = rlang::as_string(x),
        y = rlang::as_string(y)
      )
    }

    (I switched from nlevels() to dplyr::n_distinct() because am and cyl aren't factors, so nlevels() returns 0 for them and I wasn't getting results that matched your original ones.)
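
To see the difference on the columns used here, a small base-R check. In this sketch length(unique()) stands in for dplyr::n_distinct() so that the snippet needs no packages; that substitution is an assumption made for illustration, not part of the code above.

```r
# mtcars ships with base R; am and cyl are stored as numeric, not factor
is_factor_am <- is.factor(mtcars$am)

# nlevels() counts factor levels, so it returns 0 for a numeric column
levels_am <- nlevels(mtcars$am)

# Counting distinct values works regardless of the column type
distinct_am  <- length(unique(mtcars$am))
distinct_cyl <- length(unique(mtcars$cyl))

print(c(levels_am = levels_am, distinct_am = distinct_am, distinct_cyl = distinct_cyl))
```

With nlevels(), the `== 2L` condition is never true for these columns, so the branch choosing fun_t() would not be reached.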

    Now both bare symbols and strings work:

    fun_meta(mtcars, am, wt)
    # A tibble: 1 x 10
      estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
         <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
    1     1.36      3.77      2.41      5.49 6.27e-6      29.2    0.853      1.86
    # ... with 2 more variables: method <chr>, alternative <chr>

    fun_meta(mtcars, "am", "wt")
    # A tibble: 1 x 10
      estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
         <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
    1     1.36      3.77      2.41      5.49 6.27e-6      29.2    0.853      1.86
    # ... with 2 more variables: method <chr>, alternative <chr>
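
As a possible alternative, offered as a sketch rather than as part of the answer above: because ensym() supports unquoting, the symbols captured in the wrapper can be injected into the inner call with `!!` instead of being converted to strings. The helpers make_formula(), wrap_strings(), and wrap_inject() below are hypothetical stand-ins for fun_t()/fun_anova() and fun_meta(); they only build the formula so that the snippet needs nothing beyond rlang.

```r
library(rlang)

# Hypothetical stand-in for fun_t()/fun_anova(): builds the formula y ~ x only
make_formula <- function(x, y) {
  x <- rlang::ensym(x)
  y <- rlang::ensym(y)
  rlang::new_formula(y, x)
}

# Approach from the answer: convert the symbols to strings before exec()
wrap_strings <- function(x, y) {
  x <- rlang::ensym(x)
  y <- rlang::ensym(y)
  rlang::exec(make_formula, x = rlang::as_string(x), y = rlang::as_string(y))
}

# Alternative: call the inner function directly and unquote the symbols
wrap_inject <- function(x, y) {
  x <- rlang::ensym(x)
  y <- rlang::ensym(y)
  make_formula(!!x, !!y)
}

f_strings <- wrap_strings(am, wt)
f_inject  <- wrap_inject("am", "wt")

print(f_strings)
print(f_inject)
```

Under these assumptions both wrappers produce the same formula. In fun_meta() the second variant would presumably correspond to calling `.f(data, !!x, !!y)` directly instead of going through rlang::exec().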