I am not able to get rename from dplyr to use a name derived from the input variables when the corresponding function is called by lapply. It always gives only the placeholder X[[i]]. Outside lapply it works just fine. I don't know what I am missing here. Any suggestions would be highly appreciated.
Working Example:
library(dplyr)
fc <- c(1:10)
sc <- c(20:29)
df1 <- cbind.data.frame(fc, sc) # data.frame for function within lapply
df2 <- cbind.data.frame(fc, sc) # data.frame for function without lapply
addfunc <- function(var, df) {
  attach(df)
  dfname = deparse(substitute(df)) # getting df name
  new_var_name = deparse(substitute(var)) # getting var name as string
  new_var_name <- paste(new_var_name, "_new", sep = "") # appending "_new" to var name string
  df <- df %>% mutate(new_var = var + 100) # performing some mutation on var and passing to placeholder "new_var"
  df <- df %>% rename(!!new_var_name := new_var) # renaming placeholder "new_var" to "new_var_name"
  assign(dfname, df, envir = globalenv()) # pushing df to global environment
  detach(df)
}
var <- list(sc, fc) #list with variable from df for lapply
#using function without lapply provides desired result
addfunc(fc, df2)
addfunc(sc, df2)
lapply(var, FUN = addfunc, df=df1) # using function within lapply does not get the new_var_name
I tried different kinds of conversion for the input to get its name: new_var_name = quote(var), new_var_name = as.character(var). But no success.
To get a better understanding of the fancy tidyverse solutions, I can recommend the programming with dplyr vignette.
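As background on where the X[[i]] placeholder comes from, here is a minimal base R sketch. The helper name show_arg is hypothetical and only for illustration. It uses the same deparse(substitute(var)) idiom as the function in the question: substitute returns the expression that was written at the call site, and lapply calls the function with the expression X[[i]] rather than with your variable name.

```r
# Hypothetical helper: report the expression supplied for `var`,
# using the same idiom as the function in the question.
show_arg <- function(var) deparse(substitute(var))

fc <- 1:10

# Called directly, the call site contains the name `fc`.
direct <- show_arg(fc)

# Called through lapply, the call site is built by lapply itself.
via_lapply <- lapply(list(fc), show_arg)[[1]]

print(direct)
print(via_lapply)
```

This is why passing the column names themselves (as symbols or strings) is more robust than trying to recover them from the call.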
In the following, I provide two different solutions and have changed the code a bit to reflect how I would do it.
addfunc_1 takes a symbol as input for var (i.e. the variable name without quotation marks). In tidyverse speak, this is a "data-variable in a function argument". Therefore, you need {{}} to use the actual value in dplyr functions.
addfunc_2 takes a string as input for var (i.e. the variable name with quotation marks). In tidyverse speak, this is an "env-variable that is a character vector". Therefore, you need to use .data to index the respective column by name in dplyr functions.
If you use lapply with list(sc, fc) without first initialising the variables (I changed how the data.frame is created and therefore haven't), it gives an error because it doesn't find these variables. You can use rlang::sym to circumvent this problem.
If you initialise sc and fc as you did in your example, lapply doesn't give an error, but it uses the data stored in these variables and passes it to the addfunc function, leading to unwanted behaviour.
library(dplyr)
df1 <- data.frame(
  fc = 1:10,
  sc = 20:29
)
# use symbols for var
addfunc_1 <- function(var, df) {
  df %>%
    mutate("{{var}}_new" := {{var}} + 100)
}
# use strings for var
addfunc_2 <- function(var, df) {
  df %>%
    mutate("{var}_new" := .data[[var]] + 100)
}
lapply(list(rlang::sym("sc"), rlang::sym("fc")), addfunc_1, df1)
#> [[1]]
#> fc sc sc_new
#> 1 1 20 120
#> 2 2 21 121
#> 3 3 22 122
#> 4 4 23 123
#> 5 5 24 124
#> 6 6 25 125
#> 7 7 26 126
#> 8 8 27 127
#> 9 9 28 128
#> 10 10 29 129
#>
#> [[2]]
#> fc sc fc_new
#> 1 1 20 101
#> 2 2 21 102
#> 3 3 22 103
#> 4 4 23 104
#> 5 5 24 105
#> 6 6 25 106
#> 7 7 26 107
#> 8 8 27 108
#> 9 9 28 109
#> 10 10 29 110
lapply(list("sc", "fc"), addfunc_2, df1)
#> [[1]]
#> fc sc sc_new
#> 1 1 20 120
#> 2 2 21 121
#> 3 3 22 122
#> 4 4 23 123
#> 5 5 24 124
#> 6 6 25 125
#> 7 7 26 126
#> 8 8 27 127
#> 9 9 28 128
#> 10 10 29 129
#>
#> [[2]]
#> fc sc fc_new
#> 1 1 20 101
#> 2 2 21 102
#> 3 3 22 103
#> 4 4 23 104
#> 5 5 24 105
#> 6 6 25 106
#> 7 7 26 107
#> 8 8 27 108
#> 9 9 28 109
#> 10 10 29 110
Created on 2023-02-07 by the reprex package (v1.0.0)
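Note that lapply returns a list with one data frame per variable, each containing only one of the new columns. If you would rather have both new columns in a single data frame, one possible approach (a sketch, not the only way) is to chain the calls with base R's Reduce, so that each call receives the data frame produced by the previous one. The name combined is just illustrative.

```r
library(dplyr)

df1 <- data.frame(fc = 1:10, sc = 20:29)

# Same definition as addfunc_2 above: `var` is a column name given as a string.
addfunc_2 <- function(var, df) {
  df %>%
    mutate("{var}_new" := .data[[var]] + 100)
}

# Reduce() starts from df1 and feeds the data frame returned by one call
# into the next call, so both new columns end up in the same data frame.
combined <- Reduce(function(d, v) addfunc_2(v, d), c("sc", "fc"), df1)

names(combined)
```

Returning the modified data frame and assigning it yourself also avoids the need for attach and for assign into the global environment.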
For your use case, it's probably easiest to just use across with mutate, as you can specify directly there how the new columns should be generated:
df1 %>%
  mutate(across(c(fc, sc), ~ .x + 100, .names = "{.col}_new"))
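If the column names are only available as strings (for example, because they are collected programmatically), the same across call can take a character vector via all_of. The following is a small sketch under that assumption; cols and res are hypothetical names chosen for illustration.

```r
library(dplyr)

df1 <- data.frame(fc = 1:10, sc = 20:29)

# Assumed for illustration: the columns to transform are stored as strings.
cols <- c("fc", "sc")

# all_of() selects exactly the columns named in `cols`;
# .names controls how the new columns are named.
res <- df1 %>%
  mutate(across(all_of(cols), ~ .x + 100, .names = "{.col}_new"))

names(res)
```

The original columns are kept, and one "_new" column is added per selected column.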