Tags: r, dplyr, tidyeval

Using multiple defused function arguments in dplyr::mutate_at


I have a simple function that replaces NA values in a column of interest (here var_to_set_1) with a more specific value whenever the variable var_x has the value "Unassigned".

Here is the function in its current state:

library(dplyr)
dummy_data <- tribble(
  ~var_x,       ~var_to_set_1,  ~var_to_set_2,
  "A",          "Type_A",       "Subtype_A",
  "Unassigned", NA_character_,  NA_character_
)

test_fun <- function(data_in, var_to_set){

  var_to_set <- enquo(var_to_set)
  
  data_out <- data_in %>%
    mutate(!!var_to_set := if_else((var_x == "Unassigned" & is.na(!!var_to_set)),
                                    true = "Type_Unassigned",
                                    false = !!var_to_set))
  return(data_out)
}

dummy_data %>% test_fun(var_to_set_1)

So far this works well, but now I want to extend this function so that it can map NA values from two or more variables of interest. Again, the assignment should depend on the variable var_x as explained above. Obviously I could call the function twice for the different var_to_set, but I want to take advantage of tidy evaluation.

Following a similar logic, I tried the plural version of enquo in combination with the ellipsis and mutate_at to change the columns var_to_set_1 and var_to_set_2 in one step. However, I quickly realised that this seemingly simple extension to several variables was much more difficult than I thought. I ventured into uncharted territory, and some questions came up that I could not answer myself.

  • How should the defused function arguments be passed to mutate_at? Using the snippet below with the !!! operator, I get the following error: Error in !var_to_set : invalid argument type.
  • How should defused and non-defused variables be passed to the inner function within mutate_at? Is the quosure property inherited?
  • Or is the whole approach flawed?

Here is a snippet of what I tried to do.

test_fun_2 <- function(data_in, ...){
  
  var_to_set <- enquos(...)
  
  data_out <- data_in %>%
    mutate_at(.vars = !!!var_to_set,
              .funs = ~{
                if_else((var_x == "Unassigned" & is.na(.)),
                        true = "Type_Unassigned",
                        false = .)}
              )

  return(data_out)
}

dummy_data %>% test_fun_2(var_to_set_1, var_to_set_2)

The expected output should look like this:

expected_data <- tribble(
  ~var_x,       ~var_to_set_1,     ~var_to_set_2,
  "A",          "Type_A",          "Subtype_A",
  "Unassigned", "Type_Unassigned", "Type_Unassigned"
)

Btw, I use version 1.0.9 of dplyr.


Solution

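  • You want across(). Its column argument takes a tidyselect expression, so the dots can be forwarded directly with c(...):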

    test_fun <- function(data_in, ...){
      data_in %>%
        mutate(across(c(...), 
                      .fns = ~ if_else((var_x == "Unassigned" & is.na(.x)),
                                        true = "Type_Unassigned",
                                        false = .x)
                     )
               )
    }
    
    dummy_data %>% test_fun(var_to_set_1, var_to_set_2)
    # # A tibble: 2 x 3
    #   var_x      var_to_set_1    var_to_set_2   
    #   <chr>      <chr>           <chr>          
    # 1 A          Type_A          Subtype_A      
    # 2 Unassigned Type_Unassigned Type_Unassigned
    

    Update: an earlier version of this answer used enquos(), but as @Lionel pointed out in the comments, that is not necessary.
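
On the first question: the error in the original attempt comes from the fact that .vars in mutate_at is an ordinary argument, not a defusing one. Outside a quoting function, !!! is just three base-R negations applied to the list of quosures, hence "invalid argument type". If you want to stay with mutate_at for some reason, a minimal sketch would be to wrap the spliced quosures in vars(), which does quote its input. The name test_fun_at is made up for this illustration and is not part of the answer above. Note that the scoped verbs such as mutate_at have been superseded by across() since dplyr 1.0.0, so the across() version remains the one to prefer.

```r
library(dplyr)

dummy_data <- tribble(
  ~var_x,       ~var_to_set_1,  ~var_to_set_2,
  "A",          "Type_A",       "Subtype_A",
  "Unassigned", NA_character_,  NA_character_
)

# Hypothetical helper name, used only for this sketch.
test_fun_at <- function(data_in, ...) {
  # Defuse the column names passed through the dots.
  vars_to_set <- enquos(...)

  data_in %>%
    mutate_at(
      # vars() quotes its arguments, so `!!!` splices the quosures here
      # instead of being evaluated as three base-R `!` calls.
      .vars = vars(!!!vars_to_set),
      # `.` is the column being modified; var_x is looked up in the data.
      .funs = ~ if_else(var_x == "Unassigned" & is.na(.),
                        true = "Type_Unassigned",
                        false = .)
    )
}

result <- dummy_data %>% test_fun_at(var_to_set_1, var_to_set_2)
result
```

The lambda itself needs no unquoting: only the column selection has to be defused and spliced, while var_x is referred to by name as in the question's own snippet.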