Tags: r, rlang, tidyeval

A simple example of tidy evaluation for formulas


I am trying to grasp tidy evaluation from rlang. As a short example, I would like to add a column of predictions to a data frame. This is already implemented in modelr, but I want to pass the formula directly so that I can practice some tidy evaluation.

I have the following function:

add_predictions <- function(data, model_exp){
  enquo_model_exp <- enquo(model_exp)
  fit <- data %>% as_tibble()  %>% !!enquo_model_exp
  data[[preds]] <- stats::predict(fit, data)
}

The function above is meant to perform the following steps:

  1. enquo the formula

  2. fit a model with the data and unquote the formula with !!

  3. predict using the fitted model on the data

An example of this function's usage would be something like the following:

cars %>% 
  as_tibble() %>% 
  add_predictions(lm(speed ~ dist, data = .))

Solution

  • A formula is already a quoted object, so it can be passed as an ordinary argument, and I wouldn't recommend tidy evaluation for that. I would do this as follows (using just a bit of tidyeval for the new column name):

    library(tidyverse)
    
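    add_predictions <- function(.data, formula,
                                .fun = lm, col = pred) {
      # capture the unquoted column name and turn it into a string
      col <- enquo(col)
      col <- quo_name(col)
      # the formula is handed to the modelling function as a regular argument
      mod <- .fun(formula = formula, data = .data)
      # `!!` together with `:=` lets the string name the new column
      mutate(.data, !! col := predict(mod))
    }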
    
    cars %>% 
      add_predictions(speed ~ dist, col = speed_pred) 
    
    #    speed dist speed_pred
    # 1      4    2   8.615041
    # 2      4   10   9.939581
    # 3      7    4   8.946176
    # 4      7   22  11.926392
    # 5      8   16  10.932987
    # 6      9   10   9.939581
    # 7     10   18  11.264122
    # 8     10   26  12.588663
    # 9     10   34  13.913203
    # 10    11   17  11.098554
    # ...
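
Because the formula is a regular object, it can also be stored in a variable first and combined with a different modelling function through `.fun`. The following is only a sketch under a few assumptions: `add_predictions_alt` is a hypothetical name for a variant of the function above, and it uses `as_name()` in place of `quo_name()`, which assumes a reasonably recent rlang is installed.

```r
library(dplyr)
library(rlang)

# Hypothetical variant of add_predictions(); as_name() stands in for quo_name()
add_predictions_alt <- function(.data, formula, .fun = lm, col = pred) {
  col <- as_name(enquo(col))                     # unquoted name -> string
  mod <- .fun(formula = formula, data = .data)   # formula passed as plain argument
  mutate(.data, !!col := predict(mod))           # string names the new column
}

f <- speed ~ dist                                # a formula stored in a variable
res <- add_predictions_alt(cars, f, .fun = glm, col = speed_glm)
head(res, 3)
```

With the default gaussian family, `glm` should give the same fitted values as `lm` here, so the new column is expected to match `speed_pred` from the example above.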
    

    Now I understand that you want to use tidy evaluation as an exercise. Using your desired function signature:

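    add_predictions_2 <- function(.data, model_exp, col = pred) {
      # capture the unquoted column name and turn it into a string
      col <- enquo(col)
      col <- quo_name(col)
      # capture the model call without evaluating it
      model_exp <- enquo(model_exp)
      # evaluate the call with `.` bound to the data
      mod <- rlang::eval_tidy(model_exp, data = list(. = .data))
      mutate(.data, !! col := predict(mod))
    }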
    
    cars %>% 
      as_tibble() %>% 
      add_predictions_2(lm(speed ~ dist, data = .))
    
    # # A tibble: 50 x 3
    #    speed  dist  pred
    #    <dbl> <dbl> <dbl>
    #  1     4     2  8.62
    #  2     4    10  9.94
    #  3     7     4  8.95
    #  4     7    22 11.9 
    #  5     8    16 10.9 
    #  6     9    10  9.94
    #  7    10    18 11.3 
    #  8    10    26 12.6 
    #  9    10    34 13.9 
    # 10    11    17 11.1 
    # # ... with 40 more rows
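
The key step in `add_predictions_2()` is the `eval_tidy()` call, which supplies `.` through its `data` argument. The sketch below isolates that step outside of any pipe; `capture_expr` is a hypothetical helper introduced only for illustration.

```r
library(rlang)

# Hypothetical helper: returns its argument as a quosure, unevaluated
capture_expr <- function(expr) enquo(expr)

q <- capture_expr(lm(speed ~ dist, data = .))

# Nothing has been fitted yet; `q` only holds the call and its environment
print(q)

# Binding `.` in the data mask lets the captured call find its data
fit <- eval_tidy(q, data = list(. = cars))
coef(fit)
```

Evaluating `q` without the `data` argument would fail unless a `.` object happened to exist in the calling environment, which is why the function builds the mask explicitly.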