Tags: r, function, dplyr, tidyeval

Error when using dplyr inside of a function


I'm trying to write a function that takes a subset of my original data frame and then uses dplyr's select() and mutate() to count the large/small entries, based on the sum of the length and width of the sepals or petals.

filter <- function (spp, LENGTH, WIDTH) {
  d <- subset (iris, subset=iris$Species == spp) # This part seems to work just fine
  large <- d %>%                       
    select (LENGTH, WIDTH) %>%   # This is where the problem arises.
    mutate (sum = LENGTH + WIDTH) 
  big_samples <- which(large$sum > 4)
 return (length(big_samples)) 
}

Basically, I want the function to return the number of large flowers. However, when I run the function, I get the following error:

filter("virginica", "Sepal.Length", "Sepal.Width")

 Error: All select() inputs must resolve to integer column positions.
The following do not:
*  LENGTH
*  WIDTH 

What am I doing wrong?


Solution

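  • UPDATE: As of dplyr 0.7.0, you can use tidy evaluation (tidy eval) to accomplish this. Note that the columns are then passed to the function as bare names (Sepal.Length) rather than as strings ("Sepal.Length").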

    See http://dplyr.tidyverse.org/articles/programming.html for more details.

    filter_big <- function(spp, LENGTH, WIDTH) {
      LENGTH <- enquo(LENGTH)                    # Create quosure
      WIDTH  <- enquo(WIDTH)                     # Create quosure
    
      iris %>% 
        filter(Species == spp) %>% 
        select(!!LENGTH, !!WIDTH) %>%            # Use !! to unquote the quosure
        mutate(sum = (!!LENGTH) + (!!WIDTH)) %>% # Use !! to unquote the quosure
        filter(sum > 4) %>% 
        nrow()
    }
    
    > filter_big("virginica", Sepal.Length, Sepal.Width)
    [1] 50
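
For what it's worth, here is a sketch of two variations on the same idea. It assumes a more recent setup than the answer above: the `{{ }}` (embrace) operator needs rlang 0.4.0 or later, and the `.data` pronoun is re-exported by dplyr. The helper names `count_big` and `count_big_chr` and the `threshold` argument are made up for this illustration and are not part of the original answer. The first version is shorthand for the enquo()/!! pair; the second accepts column names as strings, which is how the function in the question was called.

```r
library(dplyr)

# Hypothetical helper: same logic as filter_big(), but {{ }} replaces
# the enquo() + !! pair (assumes rlang >= 0.4.0).
count_big <- function(spp, length_col, width_col, threshold = 4) {
  iris %>%
    filter(Species == spp) %>%
    mutate(total = {{ length_col }} + {{ width_col }}) %>%
    filter(total > threshold) %>%
    nrow()
}

# Hypothetical helper: takes the column names as strings and looks them
# up in the data through the .data pronoun.
count_big_chr <- function(spp, length_col, width_col, threshold = 4) {
  iris %>%
    filter(Species == spp) %>%
    mutate(total = .data[[length_col]] + .data[[width_col]]) %>%
    filter(total > threshold) %>%
    nrow()
}

count_big("virginica", Sepal.Length, Sepal.Width)          # bare column names
count_big_chr("virginica", "Sepal.Length", "Sepal.Width")  # column names as strings
```

Both calls should give the same count as filter_big() above. The string version is probably the closer match to the original call, filter("virginica", "Sepal.Length", "Sepal.Width"). Giving the function a name other than filter also avoids masking dplyr's own filter().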