I am unsure as to how exactly the following line of code works, given that without the tilde:
betterLifeIndicators <- betterLifeIndicators |>
rename_with(paste0(.x, ")"))
it doesn't work, citing that the object '.x' is not found. Yet when adding the tilde in front of paste0:
betterLifeIndicators <- betterLifeIndicators |>
rename_with(~paste0(.x, ")"))
it works perfectly. I am quite new to R and I am struggling to understand the mechanism at work here. I am unsure as to how that whole line:
rename_with(~paste0(.x, ")"))
is achieving its purpose (appending a parenthesis to every column name in a data.frame).
This is a flaw in my understanding of R, but I would appreciate any help with understanding this whole situation since I would hate to move forward with my project while leaving such a hole in my understanding of the language.
Eh, this is some tidyverse jungle, but you can try to think about it like so:
In standard R there is a type of object called "formula" which is created with a tilde like so:
f <- ~ anything + can + be + here + paste(1, 2, 3)
We can check the class of this object:
> class(f)
[1] "formula"
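To make the "anything can be here" point concrete, here is a small base R sketch (no packages assumed). It shows that a formula only stores its right-hand side as an unevaluated expression, so a name like .x does not have to exist when the formula is created. The variable name rhs is just an illustrative choice.

```r
# Creating the formula does not evaluate its right-hand side,
# so this works even though no object called .x exists.
f <- ~ paste0(.x, ")")

class(f)        # "formula"

# The right-hand side is stored as an unevaluated call.
rhs <- f[[2]]
class(rhs)      # "call"

# It can be evaluated later, once .x is given a value.
eval(rhs, list(.x = c("a", "b")))   # "a)" "b)"
```

This delayed evaluation is what lets a function receiving the formula decide later what .x should be.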
Then, rename_with() is a function. It needs to be passed at least 2 arguments: 1) a data.frame, 2) a function that renames the columns:
> rename_with(iris, toupper)
SEPAL.LENGTH SEPAL.WIDTH PETAL.LENGTH PETAL.WIDTH SPECIES
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
In R, functions can choose what to do with their arguments. We can inspect what this particular function does:
> rename_with
function (.data, .fn, .cols = everything(), ...)
{
UseMethod("rename_with")
}
<bytecode: 0x7faad65f8510>
<environment: namespace:dplyr>
Not that useful: it only tells us that this is a generic function, and that the real body lives somewhere else. We can find it here:
> dplyr:::rename_with.data.frame
function (.data, .fn, .cols = everything(), ...)
{
.fn <- as_function(.fn)
cols <- tidyselect::eval_select(enquo(.cols), .data, allow_rename = FALSE)
names <- names(.data)
...
Now we see that the first thing it does with the second argument (.fn) is turn it into a function with as_function(.fn).
This as_function() function lives in another tidyverse package, "rlang", and we can find it there:
> rlang::as_function
function (x, env = global_env(), ..., arg = caller_arg(x), call = caller_env())
{
check_dots_empty0(...)
if (is_function(x)) {
return(x)
}
if (is_quosure(x)) {
mask <- eval_tidy(call2(environment), env = quo_get_env(x))
fn <- new_function(pairlist2(... = ), quo_get_expr(x), mask)
return(fn)
}
if (is_formula(x)) {
if (length(x) > 2) {
...
The relevant part for us is the is_formula check at the end of the output I showed. So basically rename_with has some clever(?) way to turn a formula into a function. But you can achieve the same thing by passing your own function:
iris |> rename_with(function(x) paste0(x, ")"))
Sepal.Length) Sepal.Width) Petal.Length) Petal.Width) Species)
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
The approach with the tilde (creating a formula that is then turned into a function within rename_with) is just syntactic sugar: under the hood, as_function() converts the formula into a function whose first argument is available as .x.
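You can try the conversion step yourself, outside of rename_with. The sketch below assumes the rlang package is installed; fn and fn2 are just illustrative names.

```r
library(rlang)

# Turn the formula into a function, as rename_with does internally.
fn <- as_function(~ paste0(.x, ")"))

is.function(fn)                     # TRUE
fn(c("Sepal.Length", "Species"))    # "Sepal.Length)" "Species)"

# The same function written out by hand.
fn2 <- function(x) paste0(x, ")")

identical(fn(names(iris)), fn2(names(iris)))   # TRUE
```

Both versions give the same new column names, so which one you pass to rename_with is mostly a matter of taste.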
And your first approach:
betterLifeIndicators <- betterLifeIndicators |>
rename_with(paste0(.x, ")"))
didn't work because the argument paste0(.x, ")") is neither a function nor a formula. It is an ordinary call that has to be evaluated to produce a value for .fn. When R evaluates it, it looks for an object named .x, fails to find one, and shows you the error.
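You can reproduce the failure in plain base R, with no dplyr involved. This is only a sketch: it assumes no object named .x exists in your session, the exact wording of the message depends on your R locale, and add_paren is a made-up name.

```r
# Evaluating the bare call fails straight away, because .x does not exist.
result <- tryCatch(
  paste0(.x, ")"),
  error = function(e) conditionMessage(e)
)
result   # the error message, which mentions '.x'

# Wrapping the same code in a function delays evaluation until the
# function is called, and by then x has a value.
add_paren <- function(x) paste0(x, ")")
add_paren(c("a", "b"))   # "a)" "b)"
```

The tilde version works for the same reason the function version does: neither one runs paste0 until rename_with supplies the column names.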