Consider this simple example:
library(dplyr)
# tibble() replaces the deprecated data_frame()
dataframe <- tibble(id = c(1,2,3,4),
                    group = c('a','b','c','c'),
                    value = c(200,400,120,300))
> dataframe
# A tibble: 4 x 3
id group value
<dbl> <chr> <dbl>
1 1 a 200
2 2 b 400
3 3 c 120
4 4 c 300
and this tidyeval function that uses dplyr to aggregate my dataframe according to some input column:
func_tidy <- function(data, mygroup){
  quo_var <- enquo(mygroup)
  df_agg <- data %>%
    group_by(!!quo_var) %>%
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>%
    ungroup()
  df_agg
}
Now, this works:
> func_tidy(dataframe, group)
# A tibble: 3 x 3
group mean count
<chr> <dbl> <int>
1 a 200 1
2 b 400 1
3 c 210 2
But doing the same thing from within a loop fails:
for(col in c(group)){
func_tidy(dataframe, col)
}
Error in grouped_df_impl(data, unname(vars), drop) : Column `col` is unknown
What is the problem here? How can I use my tidyeval function in a loop?
Thanks!
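The loop fails because enquo() captures the expression you typed (col), not the value stored in it, so dplyr looks for a column called col. To loop through column names, you will need to use character strings: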
for(col in "group")
When you pass this variable to your function, you will need to convert it from a character string to a symbol using rlang::sym. You use !! to unquote it, so that the symbol (rather than the name col) is what your function receives.
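To see the difference, here is a minimal sketch of what enquo() captures in each case. The helper capture() is hypothetical and exists only for this illustration; quo_get_expr() and sym() are rlang functions.

```r
library(rlang)

# Hypothetical helper: returns whatever enquo() captured for its argument
capture <- function(x) enquo(x)

col <- "group"

# Passing col directly: enquo() captures the symbol col, not the string "group"
expr_plain <- quo_get_expr(capture(col))

# Converting the string to a symbol and unquoting it: enquo() captures the symbol group
expr_unquoted <- quo_get_expr(capture(!!sym(col)))

print(expr_plain)
print(expr_unquoted)
```

The first expression is what group_by() ends up looking for in your original loop, which is why it complains about a column called col.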
So your loop would look like this (I added a print to see the output):
for(col in "group"){
print( func_tidy(dataframe, !! rlang::sym(col) ) )
}
# A tibble: 3 x 3
group mean count
<chr> <dbl> <int>
1 a 200 1
2 b 400 1
3 c 210 2
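If you want to loop over several columns and keep the results rather than just print them, the same approach extends naturally. This is only a sketch: the names cols and results are my own, and I am using id as a second grouping column purely for illustration.

```r
library(dplyr)

dataframe <- tibble(id = c(1, 2, 3, 4),
                    group = c("a", "b", "c", "c"),
                    value = c(200, 400, 120, 300))

# Same function as in the question
func_tidy <- function(data, mygroup) {
  quo_var <- enquo(mygroup)
  data %>%
    group_by(!!quo_var) %>%
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>%
    ungroup()
}

# Column names to group by, as character strings (illustrative choice)
cols <- c("group", "id")

# Collect one aggregated tibble per column name
results <- list()
for (col in cols) {
  results[[col]] <- func_tidy(dataframe, !!rlang::sym(col))
}

results[["group"]]
```

Each element of results is named after the column it was grouped by, so you can pull out whichever aggregation you need afterwards.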