Use apply() to ggplot() to create and save individual jpegs


I have seen different variations of this question, but none that straightforwardly answer a problem I come across all the time. I often have large datasets like the one described in this link:

Make multiple separate plots from single data frame in R

Example provided:

head(data)
Park_name  Zone Year  Height_mm
1     Park1 Zone1 2011        380
2     Park1 Zone1 2011        510
3     Park1 Zone1 2011        270
4     Park1 Zone2 2011        270
5     Park1 Zone2 2011        230
6     Park1 Zone2 2011        330


# load packages
require(ggplot2)
require(plyr)
# read data 
Y <- read.table("C:/data.csv", sep=",", header=TRUE)
# define the theme
th <- theme_bw() +
  theme(axis.text.x=element_text(),  
        axis.line=element_line(colour="black"), 
        panel.grid.minor = element_blank(),
        panel.grid.major = element_blank(),
        panel.background=element_blank(),
        legend.justification=c(10,10), legend.position=c(10,10), 
        legend.title = element_text(),
        legend.key = element_blank()
    )
# determine park levels (levels() needs Park_name to be a factor)
parks <- levels(Y[,"Park_name"])
# apply separately for each park
p <- lapply(parks, function(park) {
  ggplot(Y[Y[, "Park_name"]==park,], aes(x=as.factor(Year), y=Height_mm)) +
    facet_grid(Zone~.) + # show each zone in a separate facet
    geom_point() + # plot the actual heights (if desired)
    # plot the mean and confidence interval
    stat_summary(fun.data="mean_cl_boot", color="red")
})
# finally print your plots
lapply(p, function(x) print(x+th))

I want to create a single plot for each Park's Zone, plotting year against height, to put in a report appendix. Sometimes this totals over 100 plots. I do not want to facet wrap. I want each plot to be individual, and it would be great to save the jpegs automatically to a designated folder. I also want each plot to have: 1. A unique y-axis title (let's say the height column had values in both feet and meters and the figures needed to identify which one). 2. A unique main title based on the Park Name and Zone.

This is a huge challenge for me but may be an easy coding problem for someone who uses code so often. I would be eternally grateful for help, since I need this type of loop all of the time. Thank you!


Solution

  • I think the main problem with the example you provided is that the loop is made over the "parks" vector, which only contains the levels of "Park_name". I think a better approach would be to loop over the data, subsetting by each "Park_name" entry.

    I am also assuming that your data has a column holding the units (I refer to it as "Units" in the plot); if it does not, you may be able to create one with tidyr::separate. I hope you find this code useful!

    library(ggplot2)

    # folder where the plots are saved (it must already exist)
    folder <- "C:/myplots/"

    # determine the unique park names
    parks <- unique(as.character(data[, "Park_name"]))

    # lapply over each park
    p <- lapply(parks, function(park) {

      # Subset the data to the current park
      subdata <- data[data$Park_name == park, ]

      # Collapse the zone names into a single string
      zones <- paste(unique(subdata[, "Zone"]), collapse = " ")

      # Build the plot; mean_cl_boot requires the Hmisc package
      plt <- ggplot(subdata, aes(x = as.factor(Year), y = Height_mm)) +
        facet_grid(Zone ~ .) +
        geom_point() +
        # Title and y label are built from the park name, its zones
        # and the column with unit information
        labs(title = paste(park, zones, sep = " "),
             y = paste0("Height (", paste(unique(subdata$Units), collapse = "/"), ")"),
             x = "Year") +
        stat_summary(fun.data = "mean_cl_boot", color = "red")

      # Save the plot, passing it explicitly instead of relying on last_plot()
      ggsave(filename = paste0(folder, park, ".jpeg"),
             plot = plt,
             device = "jpeg",
             width = 15,
             height = 10,
             units = "cm",
             dpi = 200)

      # Return the plot so that p holds the ggplot objects
      plt
    })
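
The code above still produces one faceted plot per park, whereas the question asks for a separate plot for each Park's Zone. A minimal sketch of that variant follows, assuming the same column names and a "Units" column as above. The toy data frame, the temporary output folder and the file-name pattern are made up for illustration, and the stat_summary layer is left out so that the sketch does not depend on Hmisc; it can be added back as in the code above.

```r
library(ggplot2)

# Toy data standing in for the real data frame; the Units column is an
# assumption carried over from the answer above.
data <- data.frame(
  Park_name = rep(c("Park1", "Park2"), each = 6),
  Zone      = rep(rep(c("Zone1", "Zone2"), each = 3), times = 2),
  Year      = rep(2011:2013, times = 4),
  Height_mm = c(380, 510, 270, 270, 230, 330, 410, 390, 300, 250, 260, 310),
  Units     = "mm"
)

# Hypothetical output folder; replace with your own, e.g. "C:/myplots"
folder <- file.path(tempdir(), "myplots")
dir.create(folder, showWarnings = FALSE, recursive = TRUE)

# One data frame per Park/Zone combination; drop = TRUE skips
# combinations that do not occur in the data
groups <- split(data, list(data$Park_name, data$Zone), drop = TRUE)

saved <- lapply(groups, function(subdata) {
  # Each subset has a single park and zone; the unit is taken from its first row
  park <- as.character(subdata$Park_name[1])
  zone <- as.character(subdata$Zone[1])
  unit <- as.character(subdata$Units[1])

  plt <- ggplot(subdata, aes(x = as.factor(Year), y = Height_mm)) +
    geom_point() +
    labs(title = paste(park, zone),
         y = paste0("Height (", unit, ")"),
         x = "Year") +
    theme_bw()

  # Write one jpeg per Park/Zone and return its path
  out <- file.path(folder, paste0(park, "_", zone, ".jpeg"))
  ggsave(filename = out, plot = plt, device = "jpeg",
         width = 15, height = 10, units = "cm", dpi = 200)
  out
})
```

Here `saved` is a named list of the written file paths, one per Park/Zone combination. Using split() avoids nested loops over parks and zones, and passing `plot = plt` to ggsave() makes it explicit which plot is written.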