I've been reading through the tidy evaluation book, several questions, and a few other resources and I still can't find a straight answer to this.
Suppose I have a formula object in R, say x ~ y + z, and a function f that I define as follows.
f <- function(.data, a_formula) { .data %>% select(SOMETHING HERE) }
I want evaluating f(df, a_formula) to give the same result as df %>% select(x, y, z) without having to manually specify the formula components.
I know rlang has a few functions to deal with formulas, and there is all.vars() from base R, but I am not sure what I need to put in the internal select() call within f() for this to evaluate how I want. Is the best answer to just do basic string manipulation until I get what I want? I would prefer some way for the tidy evaluation machinery to handle this behind the scenes, so that, for example, f(df, formula(x ~ .)) would result in df %>% select(x, everything()), as usual for dplyr.
So basically what I am asking is: is there a way for me to extract the variables from a formula and have select() evaluate them as normal?
You've basically already found all the parts you need; you just need to put them together with the all_of() helper function. For example:
library(dplyr)
f <- function(.data, a_formula) {
  # all.vars() returns the formula's variable names as a character vector
  .data %>% select(all_of(all.vars(a_formula)))
}
# test with built-in iris data frame
f(iris, Sepal.Length ~ Sepal.Width)
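To see why this works, it may help to look at what all.vars() returns on its own: a plain character vector of the variable names in the formula, which all_of() then treats as column names. A small sketch using the formula from the question (the name vars is only for illustration):

```r
# all.vars() walks the formula and returns the unique variable names as strings
vars <- all.vars(x ~ y + z)
vars
# [1] "x" "y" "z"

# function calls inside the formula are ignored; only the variable names are kept
all.vars(x ~ log(y) + z)
# [1] "x" "y" "z"
```

Because the result is just a character vector, no string manipulation of the formula itself is needed.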
If you want to do something special for ., then just add that in:
f <- function(.data, a_formula) {
  # a "." in the formula means "all columns", so return the data unchanged
  if ("." %in% all.vars(a_formula)) return(.data)
  .data %>% select(all_of(all.vars(a_formula)))
}
f(iris, Sepal.Length ~ .)
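Note that returning .data unchanged keeps the original column order, whereas select(x, everything()) from the question moves x to the front. If that ordering matters, one possible variant (a sketch, not the only way to do it) is to drop "." from the extracted names and append everything():

```r
library(dplyr)

f <- function(.data, a_formula) {
  vars <- all.vars(a_formula)
  if ("." %in% vars) {
    # explicitly named variables first, then every remaining column
    return(.data %>% select(all_of(setdiff(vars, ".")), everything()))
  }
  .data %>% select(all_of(vars))
}

# Species is the last column of iris, so the reordering is visible here
res <- f(iris, Species ~ .)
names(res)
# [1] "Species"      "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"
```

With Sepal.Length ~ . the two versions give the same result on iris, since Sepal.Length is already the first column.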