I have been researching online how to write R functions, but I am still having a hard time figuring it out. Please help.
My initial code looks like:
whatever %>%
  group_by(a) %>%
  summarize(count = n()) %>%
  collect() %>%
  ggplot(aes(x = a, y = count)) +
  geom_point()
I want to repeat this multiple times since there are other columns I want to check with the same function.
So I wrote:
point_dist <- function(dta, vari) {
  dta %>%
    group_by(vari) %>%
    summarize(count = n()) %>%
    collect() %>%
    ggplot(aes(x = vari, y = count)) +
    geom_point()
}
point_dist(whatever, a)
but R keeps telling me:
Error in eval_bare(sym, env) : object 'a' not found
I don't know why.
I also don't know whether this is the right direction to go.
Thanks again.
Your issue is related to the non-standard evaluation (NSE) that dplyr functions use. When you pass a in your call to point_dist, it arrives inside the function as vari, and once group_by(vari) forces it, R tries to evaluate a as an ordinary variable in the calling environment, which of course fails. (It's even more confusing when you happen to have a variable with that name in your calling environment or higher ...)
NSE in dplyr means you can do something like select(mtcars, cyl), whereas with most standard-evaluation functions you'll need myfunc(mtcars, "cyl"), since there isn't a variable named cyl in the calling environment.
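To make the difference concrete, here is a small sketch using the built-in mtcars data. The helper pick_column is a made-up name for illustration only (it is not part of dplyr); it stands in for any ordinary standard-evaluation function.

```r
library(dplyr)

# NSE: cyl is looked up as a column of mtcars, not as a variable.
nse_result <- select(mtcars, cyl)

# Standard evaluation: the column name has to be passed as a string.
# pick_column is a hypothetical helper, not a dplyr function.
pick_column <- function(data, column_name) {
  data[, column_name, drop = FALSE]
}
se_result <- pick_column(mtcars, "cyl")

# Passing the bare name to the standard-evaluation helper fails, because
# R looks for a variable called cyl in the calling environment.
# (This assumes no variable named cyl exists there.)
bare_attempt <- tryCatch(
  pick_column(mtcars, cyl),
  error = function(e) conditionMessage(e)
)
```

The last call fails in the same way as your point_dist(whatever, a): the bare name is evaluated as a regular variable before any column lookup can happen.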
In your case, try:
point_dist <- function(dta, vari) {
  # capture the unquoted column name instead of evaluating it
  vari <- enquo(vari)
  dta %>%
    group_by(!!vari) %>%
    summarize(count = n()) %>%
    collect() %>%
    ggplot(aes(x = !!vari, y = count)) +
    geom_point()
}
This method of dealing with unquoted column names in your functions can be confusing if you're used to normal R function definitions and/or are not familiar with NSE. It can serve as a good template if that's as far as you're going to go with it; otherwise, I strongly urge you to read a little more at the first reference below.
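As a usage sketch, the function can then be called once per column with a bare column name. The example below uses the built-in mtcars in place of your whatever table, and it is written with the {{ }} shorthand, which, assuming rlang 0.4.0 or later, combines the enquo() and !! steps. The name point_dist2 is only there to keep it apart from the version above.

```r
library(dplyr)
library(ggplot2)

# Same idea as point_dist, using the {{ }} shorthand
# (assumes rlang >= 0.4.0 and a ggplot2 version with tidy-eval support in aes()).
point_dist2 <- function(dta, vari) {
  dta %>%
    group_by({{ vari }}) %>%
    summarize(count = n()) %>%
    collect() %>%
    ggplot(aes(x = {{ vari }}, y = count)) +
    geom_point()
}

# One call per column you want to check; mtcars stands in for your data.
p_cyl  <- point_dist2(mtcars, cyl)
p_gear <- point_dist2(mtcars, gear)
```

Printing p_cyl or p_gear draws the plot. On a local data frame collect() simply returns the data unchanged, so the same function should work for both local and remote tables.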
Some good references for NSE, specifically in/around tidyverse stuff: