I have seen different variations of this question, but none that give a straightforward answer to a problem I come across all the time. I often have large datasets like the one described in this link:
Make multiple separate plots from single data frame in R Example provided:
head(data)
Park_name Zone Year Height_mm
1 Park1 Zone1 2011 380
2 Park1 Zone1 2011 510
3 Park1 Zone1 2011 270
4 Park1 Zone2 2011 270
5 Park1 Zone2 2011 230
6 Park1 Zone2 2011 330
# load packages
require(ggplot2)
require(plyr)
# read data
Y <- read.table("C:/data.csv", sep=",", header=TRUE)
# define the theme
th <- theme_bw() +
theme(axis.text.x=element_text(),
axis.line=element_line(colour="black"),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.background=element_blank(),
legend.justification=c(10,10), legend.position=c(10,10),
legend.title = element_text(),
legend.key = element_blank()
)
# determine park levels
parks <- levels(Y[,"Park_name"])
# apply separately for each park
p <- lapply(parks, function(park) {
ggplot(Y[Y[, "Park_name"]==park,], aes(x=as.factor(Year), y=Height_mm)) +
facet_grid(Zone~.) + # show each zone in a separate facet
geom_point() + # plot the actual heights (if desired)
# plot the mean and confidence interval
stat_summary(fun.data="mean_cl_boot", color="red")
})
# finally print your plots
lapply(p, function(x) print(x+th))
I want to create a single plot for each Park's Zone, plotting year against height, to put in a report appendix. Sometimes this totals over 100 plots. I do not want to facet wrap. I want each plot to be its own figure, and it would be great to save JPEGs automatically to a designated folder. I also want each plot to carry:
1. A unique y-axis title (say the height column had values in both feet and meters, and the figure needed to identify which one).
2. A unique main title based on the Park Name and Zone.
This is a huge challenge for me but may be an easy coding problem for someone who uses code so often. I would be eternally grateful for help, since I need this type of loop all of the time. Thank you!
I think the main problem with the example you provided is that the loop is made over the "parks" vector, which only contains the levels of "Park_name". I think a better approach would be to loop over the data, subsetting by each "Park_name" entry.
I am also assuming that you have a column with the "units" variable (I added it in the plot as "Units"); however, if that is not the case, you may be able to create it using tidyr::separate. I hope you find this code useful!
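As an aside on the "Units" column before the plotting code: `separate()` lives in the tidyr package. Here is a minimal sketch of how it could split a combined value/unit column, assuming a hypothetical column `Height` holding strings such as "380 mm". The column names `Height`, `Height_value` and the sample values are made up for illustration and are not from the original data.

```r
library(tidyr)

# Hypothetical input: value and unit stored together in one text column.
raw <- data.frame(
  Park_name = c("Park1", "Park1", "Park2"),
  Height    = c("380 mm", "510 mm", "1.2 ft"),
  stringsAsFactors = FALSE
)

# Split "380 mm" at the space into a value column and a Units column.
# convert = TRUE turns the value column into a numeric type.
with_units <- separate(raw, col = Height,
                       into = c("Height_value", "Units"),
                       sep = " ", convert = TRUE)
```

If the units are not stored in the data at all, a plain assignment such as `data$Units <- "mm"` would be enough for the labelling below.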
# determine park levels
parks <- unique(data[,"Park_name"])
# lapply for each park entry
p <- lapply(parks, function(park) {
#Subset the data by each entry in the parks vector
subdata <- subset(data,data$Park_name == park)
#Collapse the zone vector as a string
zones <- paste(unique(subdata[,"Zone"]),
collapse = " ")
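##ggplot: keep the plot in a variable so it can be passed to ggsave() explicitly
plt <- ggplot(subdata, aes(x=as.factor(Year), y=Height_mm)) +
facet_grid(Zone~.) +
geom_point() +
#Add the title and y lab as variables defined by park name, zones and a column with unit information
#(labs() expects a single string, so use park and the distinct units rather than whole columns)
labs(title = paste(park, zones, sep = " "),
y = paste0("Height (", paste(unique(subdata$Units), collapse = "/"), ")"),
x = "Year") +
#mean_cl_boot needs the Hmisc package to be installed
stat_summary(fun.data="mean_cl_boot", color="red")
#Save the plot, define your folder location as "C:/myplots/"
folder <- "C:/myplots/"
ggsave(filename = paste0(folder, park,".jpeg"),
plot = plt,
device = "jpeg",
width = 15,
height = 10,
units = "cm",
dpi = 200)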
})
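The loop above writes one faceted plot per park, while the question asks for one un-faceted plot per Park/Zone pair. A minimal sketch of that variant follows. It assumes the data frame is called `data` and has a `Units` column (the same assumption as above). It uses a small made-up stand-in data set and a hypothetical output folder `out_dir`. It also swaps in `mean_se` (which ships with ggplot2) for `mean_cl_boot`, so the sketch runs without Hmisc.

```r
library(ggplot2)

# Small stand-in for the real data; the Units column is an assumption.
data <- data.frame(
  Park_name = rep(c("Park1", "Park2"), each = 6),
  Zone      = rep(rep(c("Zone1", "Zone2"), each = 3), times = 2),
  Year      = 2011,
  Height_mm = c(380, 510, 270, 270, 230, 330, 410, 390, 300, 250, 280, 310),
  Units     = "mm"
)

# Hypothetical output folder; replace with your own path.
out_dir <- file.path(tempdir(), "myplots")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# One data frame per Park/Zone combination that actually occurs.
groups <- split(data, list(data$Park_name, data$Zone), drop = TRUE)

# Build and save one plot per group; collect the file paths.
saved <- vapply(names(groups), function(key) {
  sub  <- groups[[key]]
  park <- as.character(sub$Park_name[1])
  zone <- as.character(sub$Zone[1])
  unit <- as.character(sub$Units[1])  # assumes one unit per Park/Zone group

  p <- ggplot(sub, aes(x = factor(Year), y = Height_mm)) +
    geom_point() +
    # mean and standard error in red (stand-in for mean_cl_boot)
    stat_summary(fun.data = mean_se, colour = "red") +
    labs(title = paste(park, zone),
         x = "Year",
         y = paste0("Height (", unit, ")")) +
    theme_bw()

  path <- file.path(out_dir, paste0(park, "_", zone, ".jpeg"))
  ggsave(path, plot = p, device = "jpeg",
         width = 15, height = 10, units = "cm", dpi = 200)
  path
}, character(1))
```

`saved` then holds the path of every JPEG written, one per Park/Zone pair. File names are built from the park and zone values, so those values need to be valid in file names on your system.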