
R - how to filter data with a list of arguments to produce multiple data frames and graphs


I am looking for a way to use a list of filter arguments to produce different objects. I have a data set from which I want to make several graphs, each based on a different subset of the data. For illustrative purposes I have made the following data.

df <- data.frame(type = c("b1", "b2", "b1", "b2"),
                 yield = c("15", "10", "5", "0"),
                 temperature = c("2", "21", "26", "13"),
                 Season = c("Winter", "Summer", "Summer", "Autumn"),
                 profit = c(TRUE, TRUE, FALSE, FALSE))

Also, I have a list of filter arguments.

filters <- c("brand=='b1'",
             "profit",
             "Season=='Summer'",
             "profit==FALSE",
             "yield >= 10",
             "")

What I want is a for loop that applies each of these filters, stores the filtered data in its own object, and then plots a graph from it. I tried it the following way.

for(i in 1:length(filters)){
  assign(paste0("df", i), filter(df, factor(filters[i])))
  assign(paste0("plot", i), ggplot(database, aes(x = temperature, y = yield)) + geom_point())
}

However, this did not work because filter() accepts neither a <fct> nor a <chr> (e.g., "brand=='b1'") as a condition. What I need is the unquoted expression brand=='b1', so that filter() accepts it as an argument. Does anybody have an idea how to do this?

Also, as an additional question, I would like to automate the whole process and end with a combined graph, so grid.arrange() at the end. Of course I could automate ncol and nrow with some division of length(filters). But how do I get all the produced plots into grid.arrange()? This should probably be outside the for loop, right? Any ideas here?


Solution

  • You can do this with eval and parse: parse(text = ...) turns each filter string into an expression, and eval() evaluates it inside filter().

    Also, an lapply over a custom function sounds more reasonable than a for loop with assign. The result is a list of ggplot objects.

    To arrange all the charts together, grid.arrange from the gridExtra package works fine. You just need to pass the list of your charts to the argument called grobs.

    library(dplyr)
    library(ggplot2)
    
    df <- data.frame(type = c("b1", "b2", "b1", "b2"),
                     yield = c(15, 10, 5, 0),
                     temperature = c("2", "21", "26", "13"),
                     Season = c("Winter", "Summer", "Summer", "Autumn"),
                     profit = c(TRUE, TRUE, FALSE, FALSE))
    
    filters <- list("type=='b1'",
                    "profit",
                    "Season=='Summer'",
                    "profit==FALSE",
                    "yield >= 10",
                    "TRUE")
    
    
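    # Apply one filter string to df and return a scatter plot of the subset
    myfun <- function(fltr, df){
    
      # parse() turns the string into an expression; eval() runs it inside filter()
      df <- filter(df, eval(parse(text = fltr)))
      ggplot(df, aes(x = temperature, y = yield)) + geom_point()
    
    }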
    
    
    ggs <- lapply(filters, myfun, df = df)
    
    gridExtra::grid.arrange(grobs = ggs)
    
    


    I made a few changes to your code: yield is now numeric, because the filter yield >= 10 only makes sense for numeric vectors; the first filter refers to type instead of brand, because the data has no brand column; and the last filter, which was empty, is now "TRUE" (I assumed you wanted to keep all rows in that case).
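
As a possible variation on the approach above, here is a minimal sketch that swaps eval(parse()) for rlang::parse_expr() with the !! operator, keeps the filtered data frames in a named list instead of using assign, and derives the number of grid columns from the number of plots. The helper name filter_by_string, the numeric temperature column, and the ceiling(sqrt(n)) layout rule are assumptions made for illustration; they are not part of the original question or answer.

```r
library(dplyr)
library(ggplot2)

# Same toy data as above; temperature is numeric here (an assumption) so that
# the x axis is continuous rather than discrete.
df <- data.frame(type = c("b1", "b2", "b1", "b2"),
                 yield = c(15, 10, 5, 0),
                 temperature = c(2, 21, 26, 13),
                 Season = c("Winter", "Summer", "Summer", "Autumn"),
                 profit = c(TRUE, TRUE, FALSE, FALSE))

filters <- c("type == 'b1'",
             "profit",
             "Season == 'Summer'",
             "profit == FALSE",
             "yield >= 10",
             "TRUE")

# Hypothetical helper: parse_expr() converts the string into an expression,
# and !! injects that expression into filter().
filter_by_string <- function(fltr, data) {
  filter(data, !!rlang::parse_expr(fltr))
}

# One filtered data frame per filter, named after the filter string
subsets <- lapply(filters, filter_by_string, data = df)
names(subsets) <- filters

# One plot per subset, titled with the filter that produced it
plots <- Map(function(d, title) {
  ggplot(d, aes(x = temperature, y = yield)) +
    geom_point() +
    ggtitle(title)
}, subsets, names(subsets))

# Roughly square layout: 6 plots give 3 columns (the rule is an assumption)
n_col <- ceiling(sqrt(length(plots)))
gridExtra::grid.arrange(grobs = unname(plots), ncol = n_col)
```

Each filtered data frame stays available afterwards, for example as subsets[["yield >= 10"]], so there is no need to create df1, df2, and so on with assign. Like eval(parse()), this evaluates arbitrary code contained in the strings, so it is only suitable for filter strings you wrote yourself.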