I am trying to generate a plot derived from my data and a multiple regression model run on these data. I am having trouble getting everything that I need plotted together within one graph however (i.e, raw data points + fitted lines + 95% CI's). In this model, there is a polynomial numeric predictor that interacts with a factor (three levels), so I will have three fitted lines in the plot.
#Here are the data
list<-c(60,75,90,120,180)
x1<-rep(list,each=3,times=3) #predictor 1
x2<-rep(seq(1:3),each=15) #predictor 2
y<-c(72, 63, 58, 56, 50, 52, 47, 48, 51, 41, 47, 44, 38, 34, 36, 92, 93, 88, 76, 76, 74, 72, 67, 78, 56, 71, 65, 53, 56, 60, 93, 73, 77, 96, 79, 81, 84, 79, 86, 80, 76, 75, 69, 61, 63)
df<-data.frame(cbind(x1,x2,y)) #combine vectors into data frame
df$x2<-factor(df$x2) #make x2 a factor
#here is the model
mod<-lm(y~poly(x1,2)*x2,data=df)
I prefer to start from an empty plot and build up from there. I can easily plot the raw data, but I'm not sure how to get the fitted lines and 95% confidence intervals from the model added in here.
#create empty plot
plot(y~x1,xlim=c(min(x1),max(x1)),ylim=c(min(y),max(y)),
type="n",xlab="x1",ylab="y",data=df)
#add data points
points(y[x2==1]~x1[x2==1], cex=1.7,pch=21,lwd=2,bg="gray10", data=df)
points(y[x2==2]~x1[x2==2], cex=1.7,pch=22,lwd=2,bg="gray40", data=df)
points(y[x2==3]~x1[x2==3], cex=1.7,pch=25,lwd=2,bg="gray80", data=df)
Through a lot of research and tinkering, I figured out how to get the fitted lines and 95% CI's using the 'effects' package, but I don't know how to add the raw data to this plot.
library(effects)
plot(allEffects(mod,xlevels=list(x1=min(x1):max(x1)),xlab="x1",ylab="y"),multiline=T,rug=F,ci.style="bands")
That was a lot of writing and code, but I hope what I'm trying to do is clear. Thanks you very much in advance for your help.
The predict
function handles all the messy calculations with the orthogonal polynomials:
x.two <- df$x2
lines(x = sort(x.two),
y = predict(mod, newdata=data.frame(x1=factor("1"), x2=sort(x.two) ) ) ,
col="red")
lines(x = sort(x.two),
y = predict(mod, newdata=data.frame(x1=factor("2"), x2=sort(x.two) ) ) ,
col="green")
lines(sort(x.two),
predict(mod, newdata=data.frame(x1=factor("3"),x2=sort(x.two) ) ) , col="orange")