Rename Variables within lapply fun function


I am not able to get rename from dplyr to use a name derived from the input variable when the corresponding function is called by lapply. It always gives only the placeholder X[[i]]. Outside lapply it works just fine. I don't know what I am missing here. Any suggestions would be highly appreciated.

Working Example:

library(dplyr)
fc <- c(1:10)
sc <- c(20:29)
df1 <- cbind.data.frame(fc, sc) #data.frame for function within lapply
df2 <- cbind.data.frame(fc, sc) #data.frame for function without lapply

addfunc <- function(var, df) {
  attach(df)
  dfname = deparse(substitute(df)) #getting df name
  new_var_name = deparse(substitute(var)) #getting var name as string
  new_var_name <- paste(new_var_name, "_new", sep = "") #appending "_new" to var name string
  df <- df %>% mutate(new_var = var + 100) # performing some mutation on var and passing to place holder "new_var"
  df <- df %>% rename(!!new_var_name := new_var) # renaming placeholder "new_var" to "new_var_name"
  assign(dfname, df, envir=globalenv()) # pushing df to global environment
  detach(df)
  }

var <- list(sc, fc) #list with variable from df for lapply

#using function without lapply provides desired result
addfunc(fc, df2)
addfunc(sc, df2)

lapply(var, FUN = addfunc, df=df1) # using function within lapply does not get the new_var_name

I tried different conversions of the input to get its name, e.g. new_var_name = quote(var) and new_var_name = as.character(var), but without success.


Solution

  • To get a better understanding of the fancy tidyverse solutions, I can recommend the programming with dplyr vignette.

    In the following, I provide two different solutions. I have also changed the code a bit to reflect how I would write it:

    • the functions don't have any side effects, they just return the changed data.frame
    • addfunc_1 takes a symbol as input for var (i.e. the variable name without quotation marks). In tidyverse terms, this is a "data-variable in a function argument". Therefore, you need to embrace it with {{}} so that dplyr functions use the column it refers to.
    • addfunc_2 takes a string as input for var (i.e. the variable name with quotation marks). In tidyverse terms, this is an "env-variable that is a character vector". Therefore, you need to use .data[[var]] to index the respective column by name in dplyr functions.
    • if you just use lapply with list(sc, fc) without first initialising sc and fc as variables (I haven't, because I changed how the data.frame is created), it gives an error because it can't find these variables. You can use rlang::sym to circumvent this problem.
    • note that if you first initialise sc and fc as you did in your example, lapply doesn't give an error, but it passes the data stored in these variables (not their names) to the addfunc function, leading to unwanted behaviour
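
    To see where the X[[i]] placeholder comes from, here is a small base R sketch (show_name is a hypothetical helper, not part of the original code). deparse(substitute(var)) returns the expression the caller wrote for var. lapply calls the function as FUN(X[[i]], ...), so inside lapply that expression is X[[i]] rather than fc:

    ```r
    # hypothetical helper: returns the expression supplied for `var`, as a string
    show_name <- function(var) deparse(substitute(var))

    fc <- 1:10

    # called directly: the caller wrote `fc`
    direct <- show_name(fc)

    # called by lapply: the call lapply builds is FUN(X[[i]], ...)
    via_lapply <- lapply(list(fc), show_name)[[1]]

    print(direct)
    #> [1] "fc"
    print(via_lapply)
    #> [1] "X[[i]]"
    ```

    This is why passing the name itself (as a symbol or a string) is more robust than trying to recover it from the argument.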
    library(dplyr)
    
    df1 <- data.frame(
      fc = 1:10,
      sc = 20:29
    )
    
    # use symbols for var
    addfunc_1 <- function(var, df) {
      df %>% 
        mutate("{{var}}_new" := {{var}} + 100)
    }
    
    # use strings for var
    addfunc_2 <- function(var, df) {
      df %>% 
        mutate("{var}_new" := .data[[var]] + 100)
    }
    
    lapply(list(rlang::sym("sc"), rlang::sym("fc")), addfunc_1, df1)
    #> [[1]]
    #>    fc sc sc_new
    #> 1   1 20    120
    #> 2   2 21    121
    #> 3   3 22    122
    #> 4   4 23    123
    #> 5   5 24    124
    #> 6   6 25    125
    #> 7   7 26    126
    #> 8   8 27    127
    #> 9   9 28    128
    #> 10 10 29    129
    #> 
    #> [[2]]
    #>    fc sc fc_new
    #> 1   1 20    101
    #> 2   2 21    102
    #> 3   3 22    103
    #> 4   4 23    104
    #> 5   5 24    105
    #> 6   6 25    106
    #> 7   7 26    107
    #> 8   8 27    108
    #> 9   9 28    109
    #> 10 10 29    110
    
    lapply(list("sc", "fc"), addfunc_2, df1)
    #> [[1]]
    #>    fc sc sc_new
    #> 1   1 20    120
    #> 2   2 21    121
    #> 3   3 22    122
    #> 4   4 23    123
    #> 5   5 24    124
    #> 6   6 25    125
    #> 7   7 26    126
    #> 8   8 27    127
    #> 9   9 28    128
    #> 10 10 29    129
    #> 
    #> [[2]]
    #>    fc sc fc_new
    #> 1   1 20    101
    #> 2   2 21    102
    #> 3   3 22    103
    #> 4   4 23    104
    #> 5   5 24    105
    #> 6   6 25    106
    #> 7   7 26    107
    #> 8   8 27    108
    #> 9   9 28    109
    #> 10 10 29    110
    

    Created on 2023-02-07 by the reprex package (v1.0.0)

    Edit

    For your use case, it is probably easiest to just use across with mutate, since you can specify directly there how the new columns should be generated and named:

    df1 %>% 
      mutate(across(c(fc, sc), ~.x + 100, .names = "{.col}_new"))
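
    If the columns to transform are only known at run time, the same across call could be wrapped in a function that takes the column names as strings. This is a minimal sketch, assuming dplyr >= 1.0.0; add_new_cols and its arguments are hypothetical names, and all_of() is used to select columns from a character vector:

    ```r
    library(dplyr)

    df1 <- data.frame(
      fc = 1:10,
      sc = 20:29
    )

    # hypothetical wrapper: for each column named in `vars`,
    # add a column "<name>_new" holding <name> + 100
    add_new_cols <- function(df, vars) {
      df %>%
        mutate(across(all_of(vars), ~ .x + 100, .names = "{.col}_new"))
    }

    res <- add_new_cols(df1, c("fc", "sc"))
    names(res)
    #> [1] "fc"     "sc"     "fc_new" "sc_new"
    ```

    Unlike the lapply variants, this returns a single data.frame containing all new columns, and it has no side effects on df1.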