I have this plot which is too crowded to be useful:
ggplot(data = meandist.SG, aes(x = starttime,y = meandist)) + #set main plot variables
geom_ribbon(aes(ymin=meandist-se, ymax=meandist+se, fill=mapped), alpha=0.1) + #add standard error
geom_line(aes(colour = mapped),alpha = 1) + #add a line for each group
labs(title = "Comparison of Groups", x = "time (s)", y = "mean distance (mm)") #set title, and axis labels
I can make a plot for each pair of groups by wrapping the following in mlply and passing in the possible group pairs. But this means I can't easily see all the plots at the same time.
ggplot(data = subset(meandist.SG, mapped %in% c('a', 'f')) ,aes(x = starttime,y = meandist)) + #set main plot variables
geom_ribbon(aes(ymin=meandist-se, ymax=meandist+se, fill=mapped), alpha=0.1) + #add standard error to main plot
geom_line(aes(colour = mapped),alpha = 1,size = 1) + #plot a line on main plot for each group
labs(title = 'GroupA and GroupB, Distance over Time', x = "time (s)", y = "mean distance (mm)")
What I'd like to do is create a single image with the paired group plots arranged like a pairplot, with the levels of the mapped factor along the diagonal.
The data looks like this:
> str(meandist.SG)
'data.frame': 2400 obs. of 4 variables:
$ starttime: num 0 0 0 0 0 0 0 0 60 60 ...
$ mapped : Factor w/ 8 levels "rowA","rowB",..: 1 2 3 4 5 6 7 8 1 2 ...
$ meandist : num 123.2 115 91.9 112.8 108.6 ...
$ se : num 8.95 9.54 9.57 9.86 11.96 ...
> head(meandist.SG)
  starttime mapped meandist        se
1         0   rowA 123.1739  8.952757
2         0   rowB 114.9875  9.544961
3         0   rowC  91.8875  9.571005
4         0   rowD 112.7583  9.861424
5         0   rowE 108.5826 11.962127
6         0   rowF 126.4917  9.331622
I'm thinking I should use the GGally package, but I can't figure out how to use the levels of a factor as the diagonal. Ideas?
If I understand you correctly, here is a solution using facets. I had to generate a demo dataset because your sample is not nearly sufficient.
library(ggplot2)
library(data.table)
library(plyr)
# this generates the demo dataset - you have this already
set.seed(1)
df <- do.call(rbind,lapply(1:8,function(i){
data.frame(starttime=seq(0,20000,100),
mapped=LETTERS[i],
meandist=100*i+rnorm(201,0,20),
se=50)
}))
# you start here...
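dt <- data.table(df)
# rename so this copy supplies the facet columns (H)
setnames(dt,c("starttime","mapped","meandist","se"),c("x","H","y.H","se.H"))
setkey(dt,x)
# second copy of the data, keyed on x, supplies the facet rows (V)
gg <- dt[,list(V=H,y.V=y.H,se.V=se.H),keyby="x"]
# Cartesian join on x: every H group is paired with every V group at each x
gg <- dt[gg, allow.cartesian=TRUE]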
ggp <- ggplot(gg,aes(x=x))
ggp <- ggp + geom_line(aes(y=y.H, color=H))
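# off-diagonal panels only; newer ggplot2 versions no longer accept subset=, so filter the data instead
ggp <- ggp + geom_line(data=gg[H!=V], aes(y=y.V, color=V))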
ggp <- ggp + geom_ribbon(aes(ymin=y.H-se.H, ymax=y.H+se.H, fill=H), alpha=0.1)
ggp <- ggp + geom_ribbon(aes(ymin=y.V-se.V, ymax=y.V+se.V, fill=V), alpha=0.1)
ggp <- ggp + facet_grid(V~H, scales="free")
ggp <- ggp + guides(fill=guide_legend("mapped"),color=guide_legend("mapped"))
ggp <- ggp + theme(axis.text.x=element_text(angle=-90,vjust=.2, hjust=0))
ggp <- ggp + labs(x="Start Time",y="Mean Distance")
print(ggp)
This creates a faceted pairwise plot of meandist vs. starttime for each pair of groups (the levels of mapped). Note that you get two copies of each plot (above and below the diagonal).
This approach basically creates two copies of the dataset and does a Cartesian join on the x-variable (starttime). I use data tables because the join is much more efficient and the code is more compact. I renamed the columns for convenience.
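To see what the Cartesian join produces, here is a minimal sketch in base R on a made-up toy data frame (the `toy` data and the use of `merge` are illustrative assumptions, not part of the code above). Merging the data with itself on the time column pairs every group with every other group at each time point, which is the same shape of result the data.table join builds:

```r
# Toy data (made-up values): 3 groups observed at 2 time points
toy <- data.frame(
  starttime = rep(c(0, 60), each = 3),
  mapped    = rep(c("A", "B", "C"), times = 2),
  meandist  = c(10, 20, 30, 11, 21, 31)
)

# Self-join on starttime: each group (suffix .H) is paired with each
# group (suffix .V) observed at the same time point
pairs <- merge(toy, toy, by = "starttime", suffixes = c(".H", ".V"))

# 2 time points * 3 groups * 3 groups = 18 rows
nrow(pairs)

# every (H, V) combination appears once per time point;
# each cell of this table corresponds to one panel of the facet grid
table(pairs$mapped.H, pairs$mapped.V)
```

With 8 groups the joined table has 64 rows per time point, so the result grows with the square of the number of groups; that is why an efficient join matters on the real data.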