I'm trying to write a function that takes a subset of my original data frame and then uses dplyr's select() and mutate() to count the large/small entries, based on the sum of the length and width of the sepals/petals.
filter <- function (spp, LENGTH, WIDTH) {
  d <- subset (iris, subset=iris$Species == spp) # This part seems to work just fine
  large <- d %>%
    select (LENGTH, WIDTH) %>% # This is where the problem arises.
    mutate (sum = LENGTH + WIDTH)
  big_samples <- which(large$sum > 4)
  return (length(big_samples))
}
Basically, I want the function to return the number of large flowers. However, when I run the function I get the following error:
filter("virginica", "Sepal.Length", "Sepal.Width")
Error: All select() inputs must resolve to integer column positions.
The following do not:
* LENGTH
* WIDTH
What am I doing wrong?
UPDATE: As of dplyr 0.7.0 you can use tidy eval to accomplish this.
See http://dplyr.tidyverse.org/articles/programming.html for more details.
library(dplyr)

filter_big <- function(spp, LENGTH, WIDTH) {
  LENGTH <- enquo(LENGTH) # Capture the bare column name as a quosure
  WIDTH <- enquo(WIDTH)   # Capture the bare column name as a quosure
  iris %>%
    filter(Species == spp) %>%
    select(!!LENGTH, !!WIDTH) %>%               # Use !! to unquote the quosure
    mutate(sum = (!!LENGTH) + (!!WIDTH)) %>%    # Use !! to unquote the quosure
    filter(sum > 4) %>%                         # Keep only the "large" rows
    nrow()                                      # Count them
}
> filter_big("virginica", Sepal.Length, Sepal.Width)
[1] 50
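Note that the function above takes bare column names, whereas the call in the question passed them as strings. If you would rather keep passing strings, here is a minimal sketch that may work, assuming a dplyr version that provides the `.data` pronoun (0.7.0 or later, as far as I know). The name `count_big` and the `threshold` argument are my own, not part of the original code.

```r
library(dplyr)

# Hypothetical helper (not from the original post): column names are
# passed as character strings and looked up through the .data pronoun.
count_big <- function(spp, length_col, width_col, threshold = 4) {
  iris %>%
    filter(Species == spp) %>%
    # .data[[name]] pulls the column whose name is stored in the string
    mutate(total = .data[[length_col]] + .data[[width_col]]) %>%
    filter(total > threshold) %>%
    nrow()
}

count_big("virginica", "Sepal.Length", "Sepal.Width")
```

This skips the select() step entirely, since mutate() can read the two columns directly, and it names the new column `total` to avoid confusion with base R's sum().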