r, ggplot2, purrr

ggplot2 loop over two columns to create plots in groups


I have a dataset in which one of the variables is states and the other is districts (that are located in each state), and the remaining are numerical variables. There are 32 states, and 712 districts that are spread unevenly across the states. I want to create plots separately for each state such that all its constituent districts and their numerical variables are in the plot. Also, I want to automatically assign a title containing the state name, and then save each plot (i.e., state-wise) as a PDF into a designated folder.

The code below is based on some synthetic data:

library(tidyverse)

## generate data
set.seed(321)
statenames <- c("state1","state2","state1","state2","state1","state2")
distnames <- c("dist1","dist2","dist3","dist4","dist5","dist6")
hcrv <- rnorm(n=6,mean = 10, sd=2)
intv <- rnorm(n=6,mean = 100, sd=20)
rdiv <- rnorm(n=6,mean = 150, sd=2)

plotdata <- cbind.data.frame(statenames, distnames, hcrv,intv,rdiv)

## Use the purrr package to create state-wise plots
plotdata %>%
  pivot_longer(cols = -c(statenames, distnames), names_to =  "indicatorName",
               values_to = "indicatorValue") %>% 
    group_split(statenames) %>%
    purrr::map(ggplot(., aes(x=indicatorName, y=indicatorValue))+
            geom_col())

However, I get the following error: Error in fortify(): ! data must be a <data.frame>, or an object coercible by fortify(), or a valid <data.frame>-like object coercible by as.data.frame(). Caused by error in .prevalidate_data_frame_like_object(): ! dim(data) must return an <integer> of length 2. Run rlang::last_trace() to see where the error occurred.

I also could not figure out how to automatically save the plots as PDFs in the desired folder. Any help is appreciated!

Essentially, in the dplyr parlance, I am filtering each state and then plotting district-wise indicators. For example,

## plot for state1
plotdata %>%
  pivot_longer(cols = -c(statenames, distnames), names_to =  "indicatorName",
               values_to = "indicatorValue") %>% 
    filter(statenames=="state1") %>%
    ggplot(aes(x=indicatorName, y=indicatorValue))+
    geom_col()+
    facet_wrap(~distnames)+
    labs(title = "state1")


## plot for state2
plotdata %>%
  pivot_longer(cols = -c(statenames, distnames), names_to =  "indicatorName",
               values_to = "indicatorValue") %>% 
    filter(statenames=="state2") %>%
    ggplot(aes(x=indicatorName, y=indicatorValue))+
    geom_col()+
    facet_wrap(~distnames)+
    labs(title = "state2")

Solution

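  • This uses a similar map workflow, but passes map() a function rather than a ggplot expression. In the original call, ggplot(., ...) was evaluated straight away with . bound to the whole list of split data frames, which is what triggers the fortify() error. It also puts the state name in each plot title and includes a line to save the plots as PDFs.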

    # Reshape to long format, split by state, and name each piece after its state
    DATA_LIST <- plotdata %>%
      pivot_longer(cols = -c(statenames, distnames), names_to = "indicatorName",
                   values_to = "indicatorValue") %>%
      group_split(statenames) %>%
      set_names(nm = map_chr(., \(x) first(x$statenames)))

    # One plot per state, titled with the state name
    PLOT_LIST <- map(DATA_LIST, \(x) ggplot(x, aes(x = indicatorName, y = indicatorValue)) +
                       geom_col() + labs(title = first(x$statenames)))

    # Save each plot as <state>_plot.pdf in the working directory
    walk2(PLOT_LIST, names(PLOT_LIST), \(x, y) ggsave(plot = x, filename = paste0("./", y, "_plot.pdf")))
    

    I used walk2 instead of map2 because the result does not need to be saved as an object. map2 would also work, but it would return a list of the saved file names and print it to the console.
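
    The question's per-state examples also facet by district and ask for a designated output folder. A minimal sketch of how the same workflow might cover both is below. It assumes a hypothetical folder name (out_dir, placed under tempdir() here so it can be run anywhere), swaps group_split() for base split() so the list is already named by state, and picks an arbitrary 7 x 5 inch page size. Adjust these to your setup.

```r
library(tidyverse)

# Synthetic data with the same shape as in the question
set.seed(321)
plotdata <- data.frame(
  statenames = c("state1", "state2", "state1", "state2", "state1", "state2"),
  distnames  = c("dist1", "dist2", "dist3", "dist4", "dist5", "dist6"),
  hcrv = rnorm(6, mean = 10, sd = 2),
  intv = rnorm(6, mean = 100, sd = 20),
  rdiv = rnorm(6, mean = 150, sd = 2)
)

# Hypothetical output folder (an assumption); replace with your own path
out_dir <- file.path(tempdir(), "state_plots")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Long format: one row per state, district, and indicator
long_data <- plotdata %>%
  pivot_longer(cols = -c(statenames, distnames),
               names_to = "indicatorName",
               values_to = "indicatorValue")

# split() returns a list whose names are the state names
data_list <- split(long_data, long_data$statenames)

# imap() hands each data frame and its name (the state) to the function
plot_list <- imap(data_list, \(df, state) {
  ggplot(df, aes(x = indicatorName, y = indicatorValue)) +
    geom_col() +
    facet_wrap(~distnames) +   # one panel per district, as in the question
    labs(title = state)
})

# iwalk() is used for its side effect: one PDF per state in out_dir
iwalk(plot_list, \(p, state) {
  ggsave(filename = file.path(out_dir, paste0(state, "_plot.pdf")),
         plot = p, width = 7, height = 5)
})
```

    With the real data, this should produce one PDF per state (32 files), each with one panel per district in that state. States with many districts may need a larger width and height to stay readable.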