r, function, dplyr, nse

dplyr NSE mode in a function: nested conditions


The objective is to transform column(s) of a data frame. Here is the example:

  library(dplyr)
  library(lazyeval)

  df <- data.frame(fact  = c("dog", 2, "NA", 0, "cat", 1, "Cat"),
                   value = c(4, 2, 6, 0, 9, 1, 3))
  df$fact <- as.factor(df$fact)

  func <- function(data, fac, val) {
    data <- data %>%
      mutate_(fac = interp(~tolower(fac), fac = as.name(fac))) %>%
      mutate_(val = interp(~ifelse(fac == 'cat', 1 * val,
                            ifelse(fac == 'dog', 2 * val, 0)),
                           fac = as.name(fac), val = as.name(val)))
    return(data)
  }

The call:

  new.df <- func(df, "fact", "value")

     fact value fac val
   1  dog     4 dog   8
   2    2     2   2   0
   3   NA     6  na   0
   4    0     0   0   0
   5  cat     9 cat   9
   6    1     1   1   0
   7  Cat     3 cat   0

This result presents two issues: (1) the value associated with "Cat" is wrong; it should be 3*1 = 3, not 0. (2) Ideally, the call would return the original data.frame df with the fact and value variables transformed in place, rather than with new fac and val columns appended.

Any thoughts? Thank you guys.

Edit: please note that df has another column, third, which should be left unaffected by the operations applied to fact and value.


Solution

  • If you want to return the original columns (potentially including other columns in your data.frame) under their original names, you can use a slightly different dplyr approach, with mutate_each_ instead of mutate_:

    library(lazyeval)
    library(dplyr)
    
    func <- function(data,fac,val) {
      data %>%  
        mutate_each_(interp(~tolower(var), var = as.name(fac)), fac) %>% 
        mutate_each_(interp(~ifelse(col =='cat', var, ifelse(col == 'dog',2*var, 0)), 
                 var=as.name(val), col = as.name(fac)), val)
    }
    

    Using the function:

    func(df, "fact", "value")
    #  fact value
    #1  dog     8
    #2    2     0
    #3   na     0
    #4    0     0
    #5  cat     9
    #6    1     0
    #7  cat     3
    

    The difference from akrun's answer shows up when your data has other columns that you would like to keep (they would be removed with akrun's approach because of the transmute):

    df$some_column <- letters[1:7]  # add a new column
    

    Other columns now remain in your data after using the function and the modified columns keep their original names:

    func(df, "fact", "value")
    #  fact value some_column
    #1  dog     8           a
    #2    2     0           b
    #3   na     0           c
    #4    0     0           d
    #5  cat     9           e
    #6    1     0           f
    #7  cat     3           g
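
mutate_each_() and the other lazyeval-based underscore verbs have since been deprecated in dplyr. As a rough sketch only, the same idea might be written with tidy evaluation roughly as follows, assuming a dplyr version that provides the .data pronoun and the := operator. The name func_tidy is made up for illustration and is not part of the original answer:

```r
library(dplyr)

# Same example data as in the question, plus an extra column that must be kept
df <- data.frame(fact  = c("dog", 2, "NA", 0, "cat", 1, "Cat"),
                 value = c(4, 2, 6, 0, 9, 1, 3))
df$fact <- as.factor(df$fact)
df$some_column <- letters[1:7]

# func_tidy is a hypothetical name; fac and val are column names given as strings
func_tidy <- function(data, fac, val) {
  data %>%
    mutate(
      # overwrite the column named by `fac` with its lower-cased values
      !!fac := tolower(as.character(.data[[fac]])),
      # mutate() evaluates in order, so .data[[fac]] is already lower-cased here
      !!val := ifelse(.data[[fac]] == "cat", .data[[val]],
               ifelse(.data[[fac]] == "dog", 2 * .data[[val]], 0))
    )
}

result <- func_tidy(df, "fact", "value")
print(result)
```

Because both columns are overwritten under their own names, any other column (here some_column) passes through untouched, which is the same behaviour the mutate_each_ version above aims for.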