
When I call an object with get() inside group_by and mutate, it returns the entire object rather than the grouped object. How do I fix this?


Here is my code:

data(iris)
spec<-names(iris[1:4])
iris$Size<-factor(ifelse(iris$Sepal.Length>5,"A","B"))
for(i in spec){
  attach(iris)
  output<-iris %>%
    group_by(Size)%>%
    mutate(
  out=mean(get(i)))
  detach(iris)
}

The for loop is written around some graphing and report writing that uses object 'i' in various parts. I am using dplyr and plyr.

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species Size      out
1           5.1         3.5          1.4         0.2  setosa    A 1.199333
2           4.9         3.0          1.4         0.2  setosa    B 1.199333
3           4.7         3.2          1.3         0.2  setosa    B 1.199333
4           4.6         3.1          1.5         0.2  setosa    B 1.199333
5           5.0         3.6          1.4         0.2  setosa    B 1.199333

Notice that the variable 'out' has the same value in every row. That value is the mean of the entire dataset rather than the grouped mean.

> tapply(iris$Petal.Width,iris$Size,mean)
       A        B 
1.432203 0.340625 
> mean(iris$Petal.Width)
[1] 1.199333

Solution

  • Using get() and attach() isn't really consistent with dplyr, because it messes up the environments in which the functions are evaluated. It would be better to use the standard-evaluation equivalent of mutate here, as described in the NSE vignette (vignette("nse", package="dplyr")):

    for(i in spec){
      output<-iris %>%
        group_by(Size)%>%
        mutate_(.dots=list(out=lazyeval::interp(~mean(x), x=as.name(i))))
        # print(output)
    }
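
For readers on a newer dplyr (0.7.0 or later), where mutate_() is deprecated, a minimal sketch of the same idea uses the .data pronoun to look up a column by its string name. This is an alternative to the answer's code rather than a restatement of it. The results list is a hypothetical name added here so that each iteration's output is kept; the original loop overwrites output each time.

```r
library(dplyr)

data(iris)
spec <- names(iris)[1:4]
iris$Size <- factor(ifelse(iris$Sepal.Length > 5, "A", "B"))

# 'results' is a hypothetical container, not part of the original code
results <- list()
for (i in spec) {
  # .data[[i]] resolves the column named by the string i inside the
  # current group, so no attach() or get() is needed
  results[[i]] <- iris %>%
    group_by(Size) %>%
    mutate(out = mean(.data[[i]])) %>%
    ungroup()
}
```

With this version, the 'out' column of results[["Petal.Width"]] should hold one value per Size group, which can be checked against the tapply() call in the question.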