
How do I return a data-variable in an R function?


What I am trying to do

I am trying to write a function that returns the names of certain variables of a dataset. For a test tibble test <- tibble(x1 = 1:3, x2=2:4, x3=3:5, x4=4:6), I want a function

assign_predictors_argument <- function(dataset, outcome, predictors) {
  ...
}

such that:

  1. if the argument predictors is not supplied, predictors is set to all variables in dataset apart from outcome. E.g. assign_predictors_argument(test, x1) will return c(x2, x3, x4).
  2. if the argument predictors is supplied, the function returns it unchanged. E.g. assign_predictors_argument(test, x1, c(x2, x3)) will return c(x2, x3).

What I have tried

assign_predictors_argument <- function(dataset, outcome, predictors) {
  if(missing(predictors)) {
    predictors <- dataset %>%
      dplyr::select( -{{ outcome }} ) %>%
      names()
  }
  predictors
}

What went wrong

Case 1: predictors argument missing

assign_predictors_argument(test, x1) gives the character vector "x2" "x3" "x4". However, I want it to return the unevaluated expression c(x2, x3, x4).

How do I convert this character vector to a form like the input?

Case 2: predictors argument defined

assign_predictors_argument(test, x1, c(x2, x3)) gives

Error in assign_predictors_argument(test, x1, c(x2, x3)) : 
  object 'x2' not found

It appears that the last line of the function tries to evaluate and return predictors. As x2 is not defined in the calling environment, this raises an error.
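The behaviour described above can be reproduced in base R. This is a minimal sketch, not part of the original function: returns_value() and returns_code() are hypothetical helpers, and it assumes x2 and x3 are not defined in the calling environment. Returning the argument forces it (evaluates the caller's code), while substitute() hands back the code itself.

```r
# Hypothetical helpers, for illustration only.
returns_value <- function(arg) {
  arg                # returning `arg` forces it, so c(x2, x3) gets evaluated
}

returns_code <- function(arg) {
  substitute(arg)    # captures the unevaluated expression instead
}

# Forcing the argument fails when x2 and x3 do not exist; keep the message.
forced <- tryCatch(
  returns_value(c(x2, x3)),
  error = function(e) conditionMessage(e)
)
forced

# Capturing the argument never evaluates it, so nothing is looked up.
captured <- returns_code(c(x2, x3))
captured
# c(x2, x3)
```

rlang::enexpr() plays the same role as substitute() here, which is why capturing rather than returning predictors avoids the error.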

I have tried a) changing the final line to {{predictors}} and b) changing missing(predictors) to is.null(predictors) with a default of predictors = NULL (following this). Neither has worked.

How can I return the value of predictors without either a) changing its form or b) evaluating it?


Solution

  • You were close:

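    library(dplyr)  # provides %>% (select() is called as dplyr::select below)

    assign_predictors_argument <- function(dataset, outcome, predictors) {
      if (missing(predictors)) {
        # No predictors supplied: build the call c(x2, x3, ...) from every
        # column name except the outcome.
        dataset %>%
          dplyr::select(-{{ outcome }}) %>%
          names() %>%
          {rlang::expr(c(!!!rlang::syms(.)))}
      } else {
        # Predictors supplied: return the user's expression unevaluated.
        rlang::enexpr(predictors)
      }
    }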
    
    assign_predictors_argument(test, x1)
    # c(x2, x3, x4)
    assign_predictors_argument(test, x1, c(x2, x3))
    # c(x2, x3)
    

    In the above, rlang::expr() constructs the expression that you want by 1) converting names to symbols with syms() and 2) splicing them together inside the c(...) expression with the unquote-splice operator !!!.

    For the second case, where predictors is supplied, you can simply capture the expression supplied by the user with rlang::enexpr().
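To make the two steps concrete, here is a small standalone sketch of syms() and !!! outside the function, followed by one possible way to use the resulting expression. The names predictor_syms, predictor_expr and selected are illustrative, and injecting the expression into dplyr::select() with !! is an assumption about how you might use the result downstream, not something the question specifies.

```r
library(dplyr)   # tibble(), select(), %>%
library(rlang)   # syms(), expr(), !!!

test <- tibble(x1 = 1:3, x2 = 2:4, x3 = 3:5, x4 = 4:6)

# Step 1: character names become a list of symbols.
predictor_syms <- syms(c("x2", "x3", "x4"))

# Step 2: !!! splices the symbols in as separate arguments of c().
predictor_expr <- expr(c(!!!predictor_syms))
predictor_expr
# c(x2, x3, x4)

# One possible downstream use: inject the expression into a tidy-select
# call with !!, so select() sees c(x2, x3, x4) as if typed directly.
selected <- test %>% select(!!predictor_expr)
names(selected)
# "x2" "x3" "x4"
```

Because the function returns an expression rather than a value, any caller has to inject it with !! (or evaluate it in the context of the data) before the column names mean anything.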