
Graphing using list of dataframes


Following up on this post here

I use tidymodels to fit a regression to a grouped list of dataframes. Then I predict values into another list of dataframes.

# Code from the original question
library(dplyr)

year <- rep(2014:2018, length.out=10000)
group <- sample(c(0,1,2,3,4,5,6), replace=TRUE, size=10000)
value <- sample(10000, replace=TRUE)
female <- sample(c(0,1), replace=TRUE, size=10000)
smoker <- sample(c(0,1), replace=TRUE, size=10000)
dta <- data.frame(year=year, group=group, value=value, female=female, smoker=smoker)

# cut the dataset into list
table_list <- dta %>%
  group_by(year, group) %>%
  group_split()

# fit model per subgroup
model_list <- lapply(table_list, function(x) glm(smoker ~ female, data=x,
                                                 family=binomial(link="probit")))

# create new dataset where female =1
dat_new <- data.frame(dta[, c("smoker", "year", "group")], female=1) 

# predict
pred1 <- lapply(model_list, function(x) predict.glm(x, type = "response"))

Now I would like to plot the groups from that list of predictions using ggplot and facet_wrap. This last step is where I get lost: how do I plot from a list of dataframes? The attempt below does not run.

ggplot(pred1)+
  geom_point(aes(x=year, y=value))+
  facet_wrap(~ group)

Solution

  • Follow the same pattern as in your previous question: use lapply.

    This is made easier by defining a function to perform the plotting.

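    library(ggplot2)

    # Draw one faceted scatter plot from a single data frame
    my_plot_function <- function(x) {
      ggplot(data = x) +
        geom_point(aes(x = year, y = value)) +
        facet_wrap(~ group)
    }

    # Apply the plotting function to every element of the list
    lapply(pred1, my_plot_function)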
    

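    Note that this returns a list of plots, accessible by normal list indexing with [[. Each list element must be a data frame that contains the year, value, and group columns; predict.glm with type = "response" returns a bare numeric vector, so the predictions need to be attached to their group's data before plotting.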
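
One way to get a list that ggplot can actually draw is to keep each group's predictions next to the rows they came from. The following is a minimal sketch under a few assumptions: it uses a smaller simulated dataset than the question, it plots the predicted probability rather than value, and the helper name add_predictions_to_group and the column name pred are hypothetical names introduced here, not part of the original post.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # make the simulated data reproducible

# Simulated data in the same shape as the question (smaller, for speed)
n <- 2000
dta <- data.frame(
  year   = rep(2014:2018, length.out = n),
  group  = sample(0:6, size = n, replace = TRUE),
  value  = sample(n, replace = TRUE),
  female = sample(c(0, 1), size = n, replace = TRUE),
  smoker = sample(c(0, 1), size = n, replace = TRUE)
)

# One data frame per year/group combination
table_list <- dta %>%
  group_by(year, group) %>%
  group_split()

# One probit model per data frame
model_list <- lapply(table_list, function(x) {
  glm(smoker ~ female, data = x, family = binomial(link = "probit"))
})

# Hypothetical helper: predict with female set to 1 and store the result
# in a new column, so each list element remains a data frame
add_predictions_to_group <- function(model, data) {
  new_data <- data
  new_data$female <- 1
  data$pred <- as.numeric(predict(model, newdata = new_data, type = "response"))
  data
}

pred_list <- lapply(seq_along(table_list), function(i) {
  add_predictions_to_group(model_list[[i]], table_list[[i]])
})

# Option 1: one plot per list element, returned as a list of plots
plot_list <- lapply(pred_list, function(x) {
  ggplot(x, aes(x = year, y = pred)) +
    geom_point()
})

# Option 2: bind the list into one data frame and let facet_wrap split it
pred_all <- bind_rows(pred_list)
p <- ggplot(pred_all, aes(x = year, y = pred)) +
  geom_point() +
  facet_wrap(~ group)
```

Printing p draws one panel per group, while plot_list[[1]] gives the plot for the first year/group combination. Because the model's only predictor is female and it is fixed at 1, the prediction is constant within each year/group combination, so each panel shows a single predicted probability per year. Binding the list back into one data frame is usually the simpler route when the goal is a single faceted figure, since facet_wrap expects the grouping variable to be a column of one data frame.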