I found this very helpful article on how to write a function that accepts a variable number of arguments using quosures and tidy dots. Here's some of the code:
my.summary <- function(df.name=df_tp1, group_var, ...) {
group_var <- enquo(group_var)
smry_vars <- enquos(..., .named = TRUE)
the.mean <- purrr::map(smry_vars, function(var) {
expr(mean(!!var, na.rm = TRUE))
})
names(the.mean) <- paste0("mean-", names(the.mean))
df.name %>%
group_by(!!group_var) %>%
summarise(!!!the.mean)
}
The problem is I have to call the function with a long string of variables, like this:
cm_all1 <- my.summary(df_tp1_cm, group_var=net_role, so_part_value, cult_ci, cult_sn, cult_ebc, sl_t_lrn, sl_xt_lrn, nl_netops_km, so_rt, nl_netops_trust)
I would be very happy to be able to just call it with something like
so_part_value:nl_netops_trust
instead, but this gives errors like this:
Error in so_part_value:nl_netops_trust : NA/NaN argument
I also tried putting the variable names in a character vector and then using enquo() and !! but that didn't work.
Here is my rewrite of the function using Yifu's ideas. This works for my fake data set but not the real data.
my.summary <- function(df.name=df_tp1, group_var, ...) {
## group_var <- enquo(group_var)
smry_vars <- df.name %>% select(...) %>% colnames()
df.name %>%
## group_by(!!group_var) %>%
group_by({{group_var}}) %>%
summarise_at(smry_vars,
list(mean=function(x) mean(x, na.rm=TRUE),
sd=function(x) sd(x, na.rm=TRUE),
min=function(x) min(x, na.rm=TRUE),
max=function(x) max(x, na.rm=TRUE),
q1=function(x) quantile(x, .25, na.rm=TRUE),
q2=function(x) quantile(x, .50, na.rm=TRUE),
q3=function(x) quantile(x, .75, na.rm=TRUE),
n=function(x) n()
))
}
You just need to make sure the dots (...) are evaluated in the right context (the df you provided in this example). Then you can use colnames() to extract the column names.
library(dplyr)  # provides %>% and select()
library(rlang)  # provides syms()
get_column_range <- function(df,...){
writeLines("Column names as string:")
print(df %>% select(...) %>% colnames())
writeLines("Convert back to symbols")
print(syms(df %>% select(...) %>% colnames()))
}
get_column_range(df = iris,Sepal.Length:Petal.Width)
Column names as string:
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
Convert back to symbols
[[1]]
Sepal.Length
[[2]]
Sepal.Width
[[3]]
Petal.Length
[[4]]
Petal.Width
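If you would rather keep the expression-building style from the question, the symbols can be spliced back in with !!!. The following is only a sketch under a few assumptions: it uses iris as stand-in data, base lapply() in place of purrr::map(), and a "mean_" name prefix chosen purely for illustration.

```r
library(dplyr)
library(rlang)

# Resolve the column range against the data frame, then turn names into symbols.
cols <- iris %>% select(Sepal.Length:Petal.Width) %>% colnames()
col_syms <- syms(cols)

# Build one unevaluated mean() call per column.
means <- lapply(col_syms, function(s) expr(mean(!!s, na.rm = TRUE)))
names(means) <- paste0("mean_", cols)  # illustrative naming scheme

# Splice the named expressions into summarise().
res <- iris %>%
  group_by(Species) %>%
  summarise(!!!means)
```

Here res has one row per Species and one mean_ column for each selected variable.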
dplyr functions with the _at suffix also accept column names as strings, so you don't have to convert them to quosures and then unquote them.
Note that {{ }} is an easier syntax to learn; it quotes and unquotes in a single step:
my.summary <- function(df,group_var,...){
column_names <- df %>% select(...) %>% colnames()
df %>%
group_by({{group_var}}) %>%
summarise_at(column_names,list(mean = mean))
}
my.summary(df = iris,group_var = Species,Sepal.Length:Petal.Width)
# A tibble: 3 x 5
Species Sepal.Length_mean Sepal.Width_mean Petal.Length_mean Petal.Width_mean
<fct> <dbl> <dbl> <dbl> <dbl>
1 setosa 5.01 3.43 1.46 0.246
2 versicolor 5.94 2.77 4.26 1.33
3 virginica 6.59 2.97 5.55 2.03
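As an aside, the _at verbs are superseded in dplyr 1.0.0 and later in favour of across(). A possible alternative, which is a different technique from the one above and not a drop-in replacement, is sketched below. The name my_summary_across and its cols argument are hypothetical; the range is passed as a single argument instead of through the dots.

```r
library(dplyr)

# Hypothetical variant: `cols` takes one tidyselect expression, e.g. a:b.
my_summary_across <- function(df, group_var, cols) {
  df %>%
    group_by({{ group_var }}) %>%
    summarise(
      across({{ cols }}, list(mean = ~ mean(.x, na.rm = TRUE))),
      .groups = "drop"
    )
}

res <- my_summary_across(iris, Species, Sepal.Length:Petal.Width)
```

With a named list of functions, across() names the output columns column_function by default, so the result columns should match the summarise_at() version.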
For more, see: https://rlang.r-lib.org/reference/quotation.html