I have a data frame that looks like this:
df = data.frame(id = 1:10, wt = 71:80, gender = rep(1:2, 5), race = rep(1:2, 5))
I'm trying to write a function that takes a data frame as its first argument, followed by any number of arguments that name columns in that data frame, and uses those column names to perform operations on the data frame. My function looks like this:
library(dplyr)
myFunction <- function(df, ...){
  columns <- list(...)
  for (i in 1:length(columns)){
    var <- enquo(columns[[i]])
    df <- df %>% group_by(!!var)
  }
  df2 = summarise(df, mean = mean(wt))
  return(df2)
}
I call the function as follows:
myFunction(df, race, gender)
However, I get the following error message:
Error in myFunction(df, race, gender) : object 'race' not found
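The original function fails because list(...) evaluates its arguments right away, so R looks for an object called race and does not find one. Instead, we can capture the elements in ... as quosures with quos() and then unquote-splice them into group_by() with !!!: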
myFunction <- function(dat, ...){
  columns <- quos(...)          # capture the arguments as quosures
  dat %>%
    group_by(!!! columns) %>%   # splice the quosures in as grouping variables
    summarise(mean = mean(wt))
}
myFunction(df, race, gender)
# A tibble: 2 x 3
# Groups:   race [?]
#    race gender  mean
#   <int>  <int> <dbl>
# 1     1      1    75
# 2     2      2    76
myFunction(df, race)
# A tibble: 2 x 2
#    race  mean
#   <int> <dbl>
# 1     1    75
# 2     2    76
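As a side note, when the dots are only passed on to another dplyr verb, they can also be forwarded as they are, without calling quos() first. The following is a minimal sketch of that variant, not a replacement for the function above; myFunction3 is a hypothetical name used only for illustration, and the data is the same toy data as in the question.

```r
library(dplyr)

# Same toy data as in the question
df <- data.frame(id = 1:10, wt = 71:80, gender = rep(1:2, 5), race = rep(1:2, 5))

# myFunction3 is a hypothetical name; the dots are forwarded untouched,
# and group_by() does the quoting itself
myFunction3 <- function(dat, ...) {
  dat %>%
    group_by(...) %>%
    summarise(mean = mean(wt))
}

res <- myFunction3(df, race, gender)
res
```

Capturing with quos() is still the better fit when the captured expressions need to be inspected or modified before they are used.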
NOTE: In the OP's example, 'race' and 'gender' hold identical values, so grouping by both gives the same groups as grouping by either one.
If we change one of them, we can see the difference:
df <- data.frame(id = 1:10, wt = 71:80, gender = rep(1:2, 5),
race = rep(1:2, each = 5))
myFunction(df, race, gender)
myFunction(df, race)
myFunction(df, gender)
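To make the difference concrete, here is a self-contained sketch that repeats the function and the modified data from above and stores the three results. The names res_both, res_race and res_gender are assumptions introduced only for this illustration; the means in the comments follow directly from wt = 71:80.

```r
library(dplyr)

# Modified data: race now splits the rows into two blocks of five
df <- data.frame(id = 1:10, wt = 71:80, gender = rep(1:2, 5),
                 race = rep(1:2, each = 5))

# Same function as in the answer above
myFunction <- function(dat, ...) {
  columns <- quos(...)
  dat %>%
    group_by(!!! columns) %>%
    summarise(mean = mean(wt))
}

res_both   <- myFunction(df, race, gender)  # 4 groups, means 73, 73, 78, 78
res_race   <- myFunction(df, race)          # 2 groups, means 73, 78
res_gender <- myFunction(df, gender)        # 2 groups, means 75, 76
```

With the two columns no longer identical, grouping by both yields four groups instead of two.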
If we decide to pass the arguments as quoted strings instead, we can make use of group_by_at:
myFunction2 <- function(df, ...) {
  columns <- c(...)
  df %>%
    group_by_at(columns) %>%
    summarise(mean = mean(wt))
}
myFunction2(df, "race", "gender")
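In dplyr 1.0.0 and later, the scoped verbs such as group_by_at are superseded in favour of across(). A hedged sketch of the string-based version written that way might look as follows; myFunction4 is a hypothetical name, and the code assumes dplyr >= 1.0.0 is installed.

```r
library(dplyr)

df <- data.frame(id = 1:10, wt = 71:80, gender = rep(1:2, 5),
                 race = rep(1:2, each = 5))

# myFunction4 is a hypothetical name; across() requires dplyr >= 1.0.0
myFunction4 <- function(dat, ...) {
  columns <- c(...)  # character vector of column names
  dat %>%
    group_by(across(all_of(columns))) %>%  # all_of() errors on unknown names
    summarise(mean = mean(wt))
}

res <- myFunction4(df, "race", "gender")
res
```

all_of() makes the selection strict, so a misspelled column name raises an error instead of being silently ignored.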