Tags: r, function, dplyr, rename

What is the purpose of the tilde operator when used to append the paste0() function in R?


I am unsure as to how exactly the following line of code works, given that without the tilde:

betterLifeIndicators <- betterLifeIndicators |>   
rename_with(paste0(.x, ")"))

it doesn't work, citing that the object '.x' is not found. Yet when adding the tilde in front of paste0:

betterLifeIndicators <- betterLifeIndicators |>   
rename_with(~paste0(.x, ")"))

it works perfectly. I am quite new to R and I am struggling to understand the mechanism at work here. I am unsure as to how that whole line:

rename_with(~paste0(.x, ")")) 

is achieving its purpose (appending a parenthesis to every column name in a data.frame).

This is a flaw in my understanding of R, but I would appreciate any help with understanding this whole situation since I would hate to move forward with my project while leaving such a hole in my understanding of the language.


Solution

  • Eh, this is some tidyverse jungle, but you can try to think about it like so:

    In standard R there is a type of object called "formula" which is created with a tilde like so:

    f <- ~ anything + can + be + here + paste(1, 2, 3)
    

    Checking the class of this object gives:

    > class(f)
    [1] "formula"
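
    To see that a formula only stores code rather than running it, here is a small base R sketch. The formula below is the one from the question; the variable name rhs is just an illustrative choice. Note that .x does not exist anywhere, yet building the formula raises no error:

```r
# A one-sided formula keeps its right-hand side as an unevaluated call.
# `.x` is not defined anywhere, yet creating the formula does not fail.
f <- ~ paste0(.x, ")")

class(f)     # the object is of class "formula"
length(f)    # two elements: the `~` symbol and the right-hand side

# Pull out the stored expression; it is a call object, not a result.
rhs <- f[[2]]
is.call(rhs)
print(rhs)
```

    Nothing is looked up until someone decides to evaluate that stored call, which is exactly the hook the tidyverse uses.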
    

    Next, rename_with() is a function that needs at least two arguments: 1) a data.frame, 2) a function that renames the columns:

    > head(rename_with(iris, toupper))
      SEPAL.LENGTH SEPAL.WIDTH PETAL.LENGTH PETAL.WIDTH SPECIES
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    5          5.0         3.6          1.4         0.2  setosa
    6          5.4         3.9          1.7         0.4  setosa
    

    In R, functions can choose what to do with their arguments. We can try to inspect what this particular function does:

    > rename_with
    function (.data, .fn, .cols = everything(), ...) 
    {
        UseMethod("rename_with")
    }
    <bytecode: 0x7faad65f8510>
    <environment: namespace:dplyr>
    

    Not that useful: it only tells us that this is a generic function, and the real body lives in a method for data frames. We can find it here:

    > dplyr:::rename_with.data.frame
    function (.data, .fn, .cols = everything(), ...) 
    {
        .fn <- as_function(.fn)
        cols <- tidyselect::eval_select(enquo(.cols), .data, allow_rename = FALSE)
        names <- names(.data)
    ...
    

    Now we see that the first thing it does with the second argument (.fn) is turn it into a function with as_function(.fn).

    as_function() lives in another tidyverse package, "rlang", and we can find it there:

    > rlang::as_function
    function (x, env = global_env(), ..., arg = caller_arg(x), call = caller_env()) 
    {
        check_dots_empty0(...)
        if (is_function(x)) {
            return(x)
        }
        if (is_quosure(x)) {
            mask <- eval_tidy(call2(environment), env = quo_get_env(x))
            fn <- new_function(pairlist2(... = ), quo_get_expr(x), mask)
            return(fn)
        }
        if (is_formula(x)) {
            if (length(x) > 2) {
    ...
    

    The relevant part for us is the is_formula() check at the end of the output shown above. So basically rename_with() has some clever(?) way to turn a formula into a function, in which .x stands for the first argument. But you can achieve the same thing by passing your own function:

    iris |> rename_with(function(x) paste0(x, ")")) |> head()
    
      Sepal.Length) Sepal.Width) Petal.Length) Petal.Width) Species)
    1           5.1          3.5           1.4          0.2   setosa
    2           4.9          3.0           1.4          0.2   setosa
    3           4.7          3.2           1.3          0.2   setosa
    4           4.6          3.1           1.5          0.2   setosa
    5           5.0          3.6           1.4          0.2   setosa
    6           5.4          3.9           1.7          0.4   setosa
    

    The approach with the tilde is just syntactic sugar: you write a formula, and rename_with() turns it into a function under the hood.
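
    As a rough sketch of that conversion, assuming dplyr and rlang are installed and R is at least 4.1 (for |> and the \(x) shorthand), you can call rlang::as_function() yourself and compare the different spellings. The names add_paren, via_formula, via_function and via_lambda are only illustrative:

```r
library(dplyr)

# rlang::as_function() turns a one-sided formula into a function
# in which `.x` refers to the first argument.
add_paren <- rlang::as_function(~ paste0(.x, ")"))
add_paren(c("a", "b"))

# The three calls below rename the columns in the same way.
via_formula  <- iris |> rename_with(~ paste0(.x, ")"))
via_function <- iris |> rename_with(function(x) paste0(x, ")"))
via_lambda   <- iris |> rename_with(\(x) paste0(x, ")"))

names(via_formula)
identical(names(via_formula), names(via_function))
```

    Which spelling to use is mostly a matter of taste; the plain function(x) and \(x) forms work in any R function, while the formula form only works where the receiving function converts it.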

    And your first approach:

    betterLifeIndicators <- betterLifeIndicators |>   
    rename_with(paste0(.x, ")"))
    

    didn't work because paste0(.x, ")") is not a function or a formula; it is an ordinary expression that R has to evaluate to get a value for .fn. When rename_with() first uses that argument, R evaluates paste0(.x, ")"), cannot find any object called .x, and raises the error.
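
    The failure can be reproduced without dplyr at all. This is a minimal base R sketch, assuming no object called .x exists in your session; msg and deferred are illustrative names:

```r
# Without the tilde, the expression is evaluated like any other R code,
# so `.x` is looked up as a regular variable and the lookup fails.
msg <- tryCatch(
  paste0(.x, ")"),
  error = function(e) conditionMessage(e)
)
print(msg)

# With the tilde, the same expression is merely stored for later use,
# so nothing is looked up and nothing fails at this point.
deferred <- ~ paste0(.x, ")")
class(deferred)
```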