Using mutate_()
With mutate_(), I used to provide a named list of new variables, each with the logic needed to create it written as a string.
library(dplyr)
library(rlang)
list_new_var <-
  list(new_1 = "am * mpg",
       new_2 = "cyl + disp")
mtcars %>%
  mutate_(.dots = list_new_var) %>%
  head()
Now I want to transition to using tidy evaluation. I am in the process of understanding the new methods.
How can I make this work? Is wrapping this in a function the generally recommended way to handle this type of situation?
f_mutate <- function(data, new) {
a <- expr(new)
b <- eval(new)
c <- syms(new)
d <- UQ(syms(new))
e <- UQS(syms(new))
f <- UQE(syms(new))
data %>%
mutate(f) %>%
head()
}
f_mutate(mtcars, new = list_new_var)
To update the previous answer somewhat with the latest rlang constructs, here are four alternative but equivalent solutions:
Instead of using base R's quote() in a list, you can use rlang's exprs() function to quote the expressions:
list_new_var <- exprs(
  new_1 = am * mpg,
  new_2 = cyl + disp
)
The function and the call stay the same:
f_mutate <- function(data, new) {
  data %>%
    mutate(!!!new)
}
f_mutate(mtcars, new = list_new_var) %>%
  head()
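If the definitions are still stored as strings, as in the original mutate_() list, one possible bridge is to parse each string into an expression before splicing. This is only a minimal sketch: it assumes rlang's parse_expr(), and the names list_new_var_chr and list_new_var_expr are made up for illustration.

```r
library(dplyr)
library(rlang)

# String-based definitions, as used with mutate_() (hypothetical name).
list_new_var_chr <- list(new_1 = "am * mpg",
                         new_2 = "cyl + disp")

# Parse each string into an unevaluated expression; lapply() keeps the names.
list_new_var_expr <- lapply(list_new_var_chr, parse_expr)

# Splice the named expressions into mutate() with !!!.
result <- mtcars %>%
  mutate(!!!list_new_var_expr)

head(result)
```

Writing the expressions directly with exprs() is probably cleaner for new code; parsing strings mostly helps when the definitions come from a config file or similar.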
To follow rlang's interface principles (non-standard evaluation, tidy eval and data masking), we can also pass each expression as its own argument and embrace it with {{ }}:
f_mutate <- function(data, new_1, new_2) {
  data %>%
    mutate(new_1 = {{ new_1 }}, new_2 = {{ new_2 }})
}
f_mutate(mtcars, new_1 = am * mpg, new_2 = cyl + disp) %>%
  head()
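For readers wondering what {{ }} does, it can be read as shorthand for capturing an argument with enquo() and injecting it with !!. The sketch below compares the two spellings; f_mutate_enquo and f_mutate_embrace are hypothetical names used only here.

```r
library(dplyr)
library(rlang)

# Two-step spelling: capture with enquo(), inject with !!.
f_mutate_enquo <- function(data, new_1, new_2) {
  new_1 <- enquo(new_1)  # the expression plus the environment it came from
  new_2 <- enquo(new_2)
  data %>%
    mutate(new_1 = !!new_1, new_2 = !!new_2)
}

# One-step spelling with the embrace operator.
f_mutate_embrace <- function(data, new_1, new_2) {
  data %>%
    mutate(new_1 = {{ new_1 }}, new_2 = {{ new_2 }})
}

res_enquo   <- f_mutate_enquo(mtcars, new_1 = am * mpg, new_2 = cyl + disp)
res_embrace <- f_mutate_embrace(mtcars, new_1 = am * mpg, new_2 = cyl + disp)

# Both spellings should produce the same columns.
all.equal(res_enquo, res_embrace)
```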
And we can even make the function more flexible by allowing a varying number of input arguments, using dots, like a true Tidyverse function:
f_mutate <- function(data, ...) {
  new <- exprs(...)
  data %>%
    mutate(!!!new)
}
f_mutate(mtcars, new_1 = am * mpg, new_2 = cyl + disp) %>%
  head()
Notice there is no need for the en-prefixed variant enexprs(). Instead, exprs() suffices because dots are already a forwarding syntax: passing ... along hands over the caller's expressions unchanged.
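One caveat: exprs() captures bare expressions without the environment they were written in, so an expression that refers to a local variable of the calling function may fail to resolve. A possible variant, sketched here, captures the dots as quosures with rlang's enquos() instead; f_mutate_quos, scale_mpg and scale_by are hypothetical names.

```r
library(dplyr)
library(rlang)

f_mutate_quos <- function(data, ...) {
  new <- enquos(...)  # quosures: each expression plus its calling environment
  data %>%
    mutate(!!!new)
}

# Hypothetical caller whose expression uses a local variable (scale_by).
scale_mpg <- function(data) {
  scale_by <- 10
  f_mutate_quos(data, mpg_scaled = mpg * scale_by)
}

res_scaled <- scale_mpg(mtcars)
head(res_scaled)
```

For expressions that only mention columns of the data, as in the examples above, both variants should behave the same.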
Because the dots are already passed to the function as-is, we actually don't even need an in-between variable new. Instead we can just do:
f_mutate <- function(data, ...) {
  data %>%
    mutate(...)
}
f_mutate(mtcars, new_1 = am * mpg, new_2 = cyl + disp) %>%
  head()
...upon which we realise we have merely wrapped the mutate() function, which is not all that useful unless we let the wrapper do other things too. That is why the in-between variable new from the previous solution is interesting: we can inject more things into it.
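As an illustration of injecting more into the intermediate list, the sketch below appends one extra, wrapper-supplied expression before splicing. The function name f_mutate_plus and the column power_to_weight are invented for this example.

```r
library(dplyr)
library(rlang)

f_mutate_plus <- function(data, ...) {
  new <- exprs(...)
  # Append a fixed column definition to whatever the caller passed in.
  new <- c(new, exprs(power_to_weight = hp / wt))
  data %>%
    mutate(!!!new)
}

res_plus <- f_mutate_plus(mtcars, new_1 = am * mpg, new_2 = cyl + disp)
head(res_plus)
```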
Also notice that for these last two solutions you can even omit the name (e.g. new_1 =) if desired, in which case the column name will be derived from the expression (i.e. am * mpg).
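A quick way to see the derived name is to call the dots-forwarding wrapper with an unnamed expression and inspect the column names. This sketch redefines f_mutate() so it runs on its own; res_unnamed is a hypothetical name.

```r
library(dplyr)

f_mutate <- function(data, ...) {
  data %>%
    mutate(...)
}

# No name supplied: the new column is labelled after the expression itself.
res_unnamed <- f_mutate(mtcars, am * mpg)
names(res_unnamed)

# Such a column needs backticks or [[ ]] to be accessed.
head(res_unnamed[["am * mpg"]])
```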
All of this shows how incredibly flexible R is for metaprogramming, made possible in large part by R's lazy evaluation of function arguments.
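The role of lazy evaluation can be seen in base R alone, without dplyr or rlang. In this small sketch (all function and variable names are made up), an unused argument is never evaluated, and substitute() retrieves the expression behind an argument so it can be evaluated later against a data frame.

```r
# An argument that is never used is never evaluated, so no error is raised.
ignore_arg <- function(x) "x was not evaluated"
lazy_result <- ignore_arg(stop("this error is never raised"))

# substitute() returns the unevaluated expression behind an argument.
capture_arg <- function(x) substitute(x)
captured <- capture_arg(am * mpg)

# The captured call can be evaluated later, here with mtcars as the data.
evaluated <- eval(captured, mtcars)
head(evaluated)
```

The tidy eval tools used above build on the same mechanism and add environment tracking and injection on top of it.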