Extract column name in mutate_if call


I would like to extract the column name inside the function passed to mutate_if. With that name, I then want to look up a value in a different table and use it to fill in missing values. I tried quosure syntax, but it does not work. Is there a way to extract the column name directly?

Sample Data

df <- structure(list(x = 1:10, 
               y = c(1L, 2L, 3L, NA, 1L, 2L, 3L, NA, 1L, 2L), 
               z = c(NA, 2L, 3L, NA, NA, 2L, 3L, NA, NA, 2L), 
               a = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e")), 
          .Names = c("x", "y", "z", "a"), 
          row.names = c(NA, -10L), 
          class = c("tbl_df", "tbl", "data.frame"))
df_lookup <- tibble(x = 0L, y = 5L, z = 8L)

Not working

Extracting the name directly inside funs() does not work.

df %>% 
  mutate_if(is.numeric, funs({
    x <- .
    x <- enquo(x)
    lookup_value <- df_lookup %>% pull(quo_name(x))
    x <- ifelse(is.na(x), lookup_value, x)
    return(x)
  }))

With an extra function I am able to extract the name, but then the replacement no longer works.

custom_mutate <- function(v) {
  v <- enquo(v)
  lookup_value <- df_lookup %>% pull(quo_name(v))

  # ifelse(is.na((!!v)), lookup_value, (!!v))
}

df %>% 
  mutate_if(is.numeric, funs(custom_mutate(v = .)))

Works

If I pass df as an additional argument to my custom function it works, but is there a way to do this without it? It feels wrong and not how dplyr is meant to be used... Correct me if I'm wrong ;)
In addition, I have to use UQE instead of !!, and as Programming with dplyr says:

UQE() is for expert use only

custom_mutate2 <- function(v, df) {
  v <- enquo(v)
  lookup_value <- df_lookup %>% pull(quo_name(v))

  df %>% 
    mutate(UQE(v) := ifelse(is.na((!!v)), lookup_value, (!!v))) %>% 
    pull(!!v)
}

df %>% 
  mutate_if(is.numeric, funs(custom_mutate2(v = ., df = df)))

Expected output

# A tibble: 10 x 4
#        x     y     z a    
#    <int> <int> <int> <chr>
#  1     1     1     8 a    
#  2     2     2     2 b    
#  3     3     3     3 c    
#  4     4     5     8 d    
#  5     5     1     8 e    
#  6     6     2     2 a    
#  7     7     3     3 b    
#  8     8     5     8 c    
#  9     9     1     8 d    
# 10    10     2     2 e   

Solution

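  • You have to use quo instead of enquo. mutate_if substitutes the column name for . before the expression is evaluated, so quo(.) captures the column symbol (x, y, z) and quo_name() turns it into a string. enquo is meant for capturing a function argument, which . is not in this context. Compare what each one returns inside funs():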

    #enquo(.) :
    <quosure: empty>
    ~function (expr) 
    {
        enexpr(expr)
    }
    ...
    
    #quo(.) :
    <quosure: frame>
    ~x
    <quosure: frame>
    ~y
    <quosure: frame>
    ~z
    

    Applied to your example:

    mutate_if(df, is.numeric, funs({
      lookup_value <- df_lookup %>% pull(quo_name(quo(.)))
      ifelse(is.na(.), lookup_value, .)
    }))
    
    # A tibble: 10 x 4
           x     y     z a    
       <int> <int> <int> <chr>
     1     1     1     8 a    
     2     2     2     2 b    
     3     3     3     3 c    
     4     4     5     8 d    
     5     5     1     8 e    
     6     6     2     2 a    
     7     7     3     3 b    
     8     8     5     8 c    
     9     9     1     8 d    
    10    10     2     2 e
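
As a side note, funs() has been deprecated since dplyr 0.8.0, so the quo(.) approach above may not carry over to newer versions. A minimal sketch of the same lookup, assuming dplyr 1.0.0 or later is available, could use across() together with cur_column(), which returns the name of the column currently being processed. This is an alternative technique, not the method from the answer above; the name result is only for illustration, and df is rebuilt with tibble() so the snippet is self-contained.

```r
library(dplyr)

# Same data as in the question, rebuilt with tibble() for brevity
df <- tibble(
  x = 1:10,
  y = c(1L, 2L, 3L, NA, 1L, 2L, 3L, NA, 1L, 2L),
  z = c(NA, 2L, 3L, NA, NA, 2L, 3L, NA, NA, 2L),
  a = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e")
)
df_lookup <- tibble(x = 0L, y = 5L, z = 8L)

# Assumption: dplyr >= 1.0.0, where across() and cur_column() exist.
# cur_column() gives the current column name as a string, which is used
# to pull the matching fill value out of df_lookup.
result <- df %>%
  mutate(across(
    where(is.numeric),
    ~ coalesce(.x, df_lookup[[cur_column()]])
  ))

print(result)
```

coalesce() replaces each NA with the lookup value and leaves the other entries untouched, so it plays the role of the ifelse(is.na(.), lookup_value, .) call. No quosures are needed because the column name arrives as a plain string.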