I have hourly temperature data from an old experiment. I would like to summarize the dataset in a ggplot graph showing the mean temperature of each experimental treatment and the hourly variation in temperature within each treatment. I would like a non-Excel graph that looks something like this:
The data are linked here. https://www.dropbox.com/sh/27cft3118dha6xt/46_xxZZano
I probably have to use reshape to format the data correctly. JD is the Julian day of the year and Time is the hour within that day. Note that labels A-H are treatment 1, I-P are treatment 2, and Q-X are treatment 3. Any advice on how best to go about this would be greatly appreciated.
Many thanks.
Maybe like this:
df<-read.csv(file="2011_Temps_obs.csv")
library(reshape2) # for melt()
library(ggplot2)
library(dplyr)    # for aggregation
# Build an "Hour" timestamp from Year, Julian day and the hour part of Time
df$Hour <- as.character(
  strptime(
    paste(df$Year, df$JD,
          substr(formatC(df$Time, width = 4, format = "d", flag = "0"), 1, 2),
          sep = "-"),
    format = "%Y-%j-%H"
  ))
m <- melt(df, id.vars = "Hour") # melt by hour
m <- m[!(m$variable %in% c("Year", "JD", "Time")), ] # filter out un-needed columns
# One "test" (treatment) entry per remaining column; the counts must add up to
# the number of columns, so adjust them to match your column layout
lookup <- data.frame(variable = unique(m$variable),
                     test = c(rep(1, 5), rep(2, 8), rep(3, 25)))
ggplot(merge(m, lookup, by = "variable")) + # merge m to get the test rollup
  geom_smooth(aes(x = Hour, y = value, group = as.factor(test),
                  fill = as.factor(test), color = as.factor(test)))
This gives the smoothed graph with confidence intervals.
Or pre-calculate your own summary statistics using dplyr:
# %>% replaces the old %.% operator, which is no longer available in dplyr
summdata <-
  merge(m, lookup, by = "variable") %>%
  group_by(Hour, test) %>%
  summarise(mean = mean(value), min = min(value), max = max(value))
ggplot(summdata, aes(group = as.factor(test), color = as.factor(test), fill = as.factor(test))) +
  geom_line(aes(x = Hour, y = mean), size = 1, alpha = 0.6) +
  geom_ribbon(aes(x = Hour, ymin = min, ymax = max), alpha = 0.1)
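If you would rather show a mean ± one standard deviation band than the min/max range, something along these lines might work. This is only a sketch on made-up data: `m_demo`, `lookup_demo` and `summ_demo` are hypothetical names, the values are random, and the treatment mapping (A-H = 1, I-P = 2, Q-X = 3) is taken from the question and assumes the columns are literally named A to X. The summary uses base R, and the plot is drawn only if ggplot2 is installed.

```r
# Hypothetical stand-in for the melted data frame: 24 probes (A-X), 3 hours
set.seed(1)
m_demo <- expand.grid(variable = LETTERS[1:24], Hour = 1:3,
                      stringsAsFactors = FALSE)
m_demo$value <- rnorm(nrow(m_demo), mean = 20, sd = 2)  # fake temperatures

# Treatment lookup as described in the question: A-H = 1, I-P = 2, Q-X = 3
lookup_demo <- data.frame(variable = LETTERS[1:24],
                          test = rep(1:3, each = 8),
                          stringsAsFactors = FALSE)

merged <- merge(m_demo, lookup_demo, by = "variable")

# Mean and standard deviation per hour and treatment
summ_demo <- aggregate(value ~ Hour + test, data = merged,
                       FUN = function(v) c(mean = mean(v), sd = sd(v)))
summ_demo <- do.call(data.frame, summ_demo)  # flatten the matrix column
names(summ_demo) <- c("Hour", "test", "mean", "sd")

# Line for the mean, ribbon for mean +/- 1 SD (only if ggplot2 is available)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(summ_demo, aes(x = Hour, group = factor(test),
                             color = factor(test), fill = factor(test))) +
    geom_line(aes(y = mean)) +
    geom_ribbon(aes(ymin = mean - sd, ymax = mean + sd), alpha = 0.1)
  print(p)
}
```

With the real data you would swap `m_demo` for the melted `m` and check that the lookup has exactly one row per column name before merging.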