Tags: r, tidyr, standard-evaluation

Standard evaluation for tidyr::complete - a function that completes by all non-numeric columns


I want to write a function that applies tidyr::complete to all non-numeric columns of an R data.frame. Zero should be inserted into the numeric columns of the newly created rows. I understand that this requires a standard-evaluation solution, but so far I've had no success.

Here is what I have thus far:

completeDf <- function(df){

      vars <- names(df)

      chVars <- vars[!(sapply(df, is.numeric))]
      nmVars <- vars[!(vars %in% chVars)]

      quoChVars <- quos(chVars)

      nmList <- vector("list", length(nmVars))
      nmList <- setNames(lapply(nmList, function(x) x <- 0), nmVars)
      quoNmVars <- quos(nmList)

      df <- df %>%
            complete(!!!quoChVars, fill = !!!quoNmVars)
}

Any idea of how to make this work?


Solution

  • 1) rlang/tidyeval Use !!!syms(notnum_names) to splice the non-numeric column names into complete() as symbols. fill is just an ordinary named list, so no rlang/tidyeval computation is needed for it.

    library(dplyr)
    library(tidyr)
    library(rlang)
    
    completeDF <- function(data) {
      is_num <- sapply(data, is.numeric)
      num_names <- names(data)[ is_num ]
      notnum_names <- names(data)[ !is_num ]
      fill <- Map(function(x) 0, num_names)
      data %>% complete(!!!syms(notnum_names), fill = fill)
    }
    
    DF <- data.frame(a = c("A", "B", "B"), b = c("a", "a", "b"), c = 1:3) # test data
    completeDF(DF)
    

    giving:

    # A tibble: 4 x 3
           a      b     c
      <fctr> <fctr> <dbl>
    1      A      a     1
    2      A      b     0
    3      B      a     2
    4      B      b     3
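
As a small base-R illustration of the fill argument built above: Map() names its result after a character first argument, so the call yields a named list with one zero per numeric column. The column names "c" and "d" here are made up for the example, and the setNames() form is only an equivalent alternative, not something the answer relies on.

```r
# Hypothetical numeric column names, for illustration only
num_names <- c("c", "d")

# Map() uses a character first argument as the names of the result
fill <- Map(function(x) 0, num_names)
str(fill)

# An equivalent way to build the same named list
fill_alt <- setNames(rep(list(0), length(num_names)), num_names)

identical(fill, list(c = 0, d = 0))  # TRUE
identical(fill, fill_alt)            # TRUE
```

Either form can be passed directly as fill = fill, since complete() expects a named list giving one replacement value per column.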
    

    Here is the original code from the question modified to make it work. The changed lines are marked with ## at the end of each.

    completeDf <- function(df){
    
          vars <- names(df)
    
          chVars <- vars[!(sapply(df, is.numeric))]
          nmVars <- vars[!(vars %in% chVars)]
    
          symsChVars <- rlang::syms(chVars) ##
    
          nmList <- vector("list", length(nmVars))
          nmList <- setNames(lapply(nmList, function(x) 0), nmVars) ##
          # quoNmVars <- quos(nmList) ##
    
          df %>% ##
                complete(!!!symsChVars, fill = nmList) ##
    }
    
    completeDf(DF)
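
A minimal sketch of why the original quos(chVars) call did not help, assuming only that rlang is installed: quos() captures the expression chVars itself as a single quosure, whereas syms() turns each string in the vector into its own symbol. The vector chVars below is a stand-in for the one computed inside the function.

```r
library(rlang)

# Stand-in for the non-numeric column names computed in the function
chVars <- c("a", "b")

# quos() quotes its argument: one quosure wrapping the name `chVars`
q <- quos(chVars)
length(q)          # 1
as_label(q[[1]])   # "chVars"

# syms() converts each string to a symbol: one symbol per column
s <- syms(chVars)
length(s)                            # 2
vapply(s, as_string, character(1))   # "a" "b"
```

Splicing q with !!! therefore hands complete() the single expression chVars rather than the columns a and b, while splicing s passes each column as a separate argument.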
    

    2) wrapr An alternative to rlang/tidyeval is the wrapr package.

    The code here is the same as in (1) except we use library(wrapr) instead of library(rlang) and the last line of completeDF is replaced with a let statement giving completeDF2.

    library(dplyr)
    library(tidyr)
    library(wrapr)
    
    completeDF2 <- function(data) {
      is_num <- sapply(data, is.numeric)
      num_names <- names(data)[ is_num ]
      notnum_names <- names(data)[ !is_num ]
      fill <- Map(function(x) 0, num_names)
      let(c(NOTNUM = toString(notnum_names)), 
          data %>% complete(NOTNUM, fill = fill),
          strict = FALSE,
          subsMethod = "stringsubs")
    }
    
    completeDF2(DF)
    
