
Passing multiple lists of arguments through rlang quosures and dplyr


Objective: write a function that takes a data frame as its first argument and two additional arguments, each a list of arguments to pass to dplyr::select, so that the function returns two data frames.

Here is a working example:

my_select <- function(.data, df1, df2) {
  DF1 <- dplyr::select(.data, rlang::UQS(df1))
  DF2 <- dplyr::select(.data, rlang::UQS(df2))
  list(DF1 = DF1, DF2 = DF2)
}

working_eg <-
  my_select(mtcars,
          df1 = alist(dplyr::contains("r"), dplyr::matches("^.p.*")),
          df2 = alist(disp))

str(working_eg, max.length = 1L)

## List of 2
##  $ DF1:'data.frame': 32 obs. of  5 variables:
##   ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##   ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
##   ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
##   ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##   ..$ hp  : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
##  $ DF2:'data.frame': 32 obs. of  1 variable:
##   ..$ disp: num [1:32] 160 160 108 258 360 ...

I would prefer that the arguments df1 and df2 take a list rather than an alist. However, my_select fails if the arguments are passed with list:

my_select(mtcars,
          df1 = list(dplyr::contains("r"), dplyr::matches("^.p.*")),
          df2 = list(disp))
## Error: Variable context not set

I don't want to ask end users to use alist, if possible, as I don't have a good way to test that alist was used to pass arguments instead of list.
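The "Variable context not set" error is consistent with how the two constructors treat their arguments: alist() stores them unevaluated, while list() evaluates each one immediately, so dplyr::contains("r") runs before select() has set up any column context. A minimal base R sketch of that difference, where some_helper is a hypothetical, deliberately undefined function standing in for a select helper:

```r
# alist() keeps its arguments as unevaluated symbols and calls.
# some_helper is hypothetical and undefined; it is never called here.
lazy <- alist(disp, some_helper("r"))
vapply(lazy, is.language, logical(1))

# list() evaluates every argument right away, so the undefined helper
# fails before any function receiving the list gets to see it.
eager <- tryCatch(
  list(some_helper("r")),
  error = function(e) conditionMessage(e)
)
eager
```

By the time my_select sees a list() argument, the helper calls have already been run (or have already failed), so there is nothing left for UQS to splice lazily.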

I've tried several combinations of rlang::UQ, rlang::UQE, rlang::UQS with rlang::quo, rlang::enquos, and rlang::quos to fix this. What I thought was the best approach was:

my_select2 <- function(.data, df1, df2) {
  DF1 <- dplyr::select(.data, rlang::UQS(rlang::quos(df1)))
  DF2 <- dplyr::select(.data, rlang::UQS(rlang::quos(df2)))
  list(DF1 = DF1, DF2 = DF2)
}

my_select2(mtcars,
          df1 = list(dplyr::contains("r"), dplyr::matches("^.p.*")),
          df2 = list(disp))
## Error: `df1` must resolve to integer column positions, not a list
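One plausible reading of this error: rlang::quos(df1) captures the expression that was typed inside the function, which is the symbol df1 itself, not the elements of the list the caller supplied. select() then evaluates df1 and receives a plain list. A small sketch, where capture is a hypothetical stand-in for the body of my_select2:

```r
# capture() is a hypothetical helper that mimics what my_select2 does with df1.
capture <- function(df1) rlang::quos(df1)

# a and b are never evaluated here, because df1 is never forced.
captured <- capture(list(a, b))

length(captured)                # a single quosure, not one per list element
rlang::get_expr(captured[[1]])  # the symbol df1, not list(a, b)
```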

Is there a way to use the rlang package with dplyr so that the syntax of my_select2 will return the same object as my_select does when arguments are passed via alist?

packageVersion("dplyr")
# [1] ‘0.7.4’
packageVersion("rlang")
# [1] ‘0.1.2’

Solution

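  • @MrFlick's comment was the reminder I needed to solve the problem and improve the API: dplyr::vars() captures its arguments as quosures without evaluating them, so end users can call it instead of alist.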

    # df1 and df2 are lists of quosures created with dplyr::vars()
    my_select3 <- function(.data, df1, df2) {
      DF1 <- dplyr::select(.data, rlang::UQS(df1))
      DF2 <- dplyr::select(.data, rlang::UQS(df2))
      list(DF1 = DF1, DF2 = DF2)
    }
    
    working_eg2 <-
      my_select3(mtcars,
                df1 = dplyr::vars(dplyr::contains("r"), dplyr::matches("^.p.*")),
                df2 = dplyr::vars(disp))
    
    all.equal(working_eg, working_eg2)
    # [1] TRUE
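
Because dplyr::vars() returns a classed object, the function can also check its inputs, which addresses the concern about not being able to test how the arguments were built. The following is a sketch only: my_select4 is a hypothetical name, and it assumes that the installed dplyr returns an object inheriting from "quosures" from vars() and that select() accepts the `!!!` operator, which is the operator form of UQS.

```r
# my_select4 is a hypothetical variant of my_select3 with input validation.
my_select4 <- function(.data, df1, df2) {
  # Assumption: dplyr::vars() output inherits from class "quosures".
  stopifnot(inherits(df1, "quosures"), inherits(df2, "quosures"))
  list(
    DF1 = dplyr::select(.data, !!!df1),  # !!! splices the captured expressions
    DF2 = dplyr::select(.data, !!!df2)
  )
}

checked_eg <-
  my_select4(mtcars,
             df1 = dplyr::vars(dplyr::contains("r"), dplyr::matches("^.p.*")),
             df2 = dplyr::vars(disp))

names(checked_eg$DF1)
names(checked_eg$DF2)
```

With this check in place, a caller who passes a plain list gets an immediate error from stopifnot rather than a less direct failure inside select().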