I have a custom function which summarises a variable. I simplified the function to illustrate my problem, i.e. it is more complex than shown below. Note that the general structure of the function should remain the same: it takes one argument specifying which dataframe to work on (df), and one argument specifying which variable to summarise (variable_to_test).
my_fun <- function(df, variable_to_test) {
  variable_to_test <- enquo(variable_to_test)
  new_var_name <- paste0(quo_name(variable_to_test), "_new_name")
  df %>%
    summarise(
      !!new_var_name := sum(!!variable_to_test, na.rm = TRUE)
    )
}
Using an example, I can apply the function on each variable in my dataframe:
library(tidyverse)
dat <- tibble(
  variable_1 = c(1:5, NA, NA, NA, NA, NA),
  variable_2 = c(NA, NA, NA, NA, NA, 11:15)
)
> my_fun(dat, variable_1)
# A tibble: 1 x 1
variable_1_new_name
<int>
1 15
> my_fun(dat, variable_2)
# A tibble: 1 x 1
variable_2_new_name
<int>
1 65
But how can I apply the function to all columns in the dataframe with lapply()? I tried
> dat %>%
+ lapply(., my_fun)
Error in duplicate(quo) : argument "quo" is missing, with no default
Called from: duplicate(quo)
but this returns an error. I'm struggling with the fact that the function takes two arguments: the dataframe to work on and the variable to summarise. Note that I'd like to keep this structure. I find it more elegant to pass the dataframe into the function than to give the function only the variable name and hard-code the dataframe in the function body. Does anybody have a good idea how to lapply() the function?
Oh, I think you're just mapping over the wrong thing. For the tidyverse solution I would try:
map(dat, ~my_fun(dat, .))
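What this does is map over the columns of dat (the columns themselves, not their names) and plug each column in at the . placeholder. One caveat: because my_fun builds the new name from the expression it receives, each result column ends up named ._new_name rather than variable_1_new_name and so on.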
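If the original column names should carry through to the output, one option is to iterate over names(dat) and turn each name into a symbol before passing it on. The following is a minimal sketch under that assumption, not the only way to do it: rlang::sym() and the !! unquoting are a swapped-in technique that does not appear in the question, and results and combined are illustrative names. It also hints at why lapply(dat, my_fun) fails: there each column is passed as df, so variable_to_test is never supplied.

```r
library(tidyverse)

# Function from the question, unchanged in structure
my_fun <- function(df, variable_to_test) {
  variable_to_test <- enquo(variable_to_test)
  new_var_name <- paste0(quo_name(variable_to_test), "_new_name")
  df %>%
    summarise(!!new_var_name := sum(!!variable_to_test, na.rm = TRUE))
}

dat <- tibble(
  variable_1 = c(1:5, NA, NA, NA, NA, NA),
  variable_2 = c(NA, NA, NA, NA, NA, 11:15)
)

# Iterate over the column *names*; sym() converts each string to a symbol,
# and !! unquotes it so enquo() sees the bare column name.
results <- lapply(names(dat), function(col) my_fun(dat, !!rlang::sym(col)))

# Optional: combine the one-column summaries into a single one-row tibble
combined <- bind_cols(results)
```

Here results is a list with one tibble per column, and combined holds variable_1_new_name and variable_2_new_name side by side. The same call works with purrr: map(names(dat), ~my_fun(dat, !!rlang::sym(.x))).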