Search code examples
rplotregressionquadratic-curve

R: generate plot for multiple regression model with interaction between polynomial numeric predictor and factor


I am trying to generate a plot derived from my data and a multiple regression model run on these data. I am having trouble getting everything that I need plotted together within one graph however (i.e, raw data points + fitted lines + 95% CI's). In this model, there is a polynomial numeric predictor that interacts with a factor (three levels), so I will have three fitted lines in the plot.

#Here are the data
list<-c(60,75,90,120,180)
x1<-rep(list,each=3,times=3) #predictor 1
x2<-rep(seq(1:3),each=15) #predictor 2
y<-c(72, 63, 58, 56, 50, 52, 47, 48, 51, 41, 47, 44, 38, 34, 36, 92, 93, 88, 76, 76, 74, 72, 67, 78, 56, 71, 65, 53, 56, 60, 93, 73, 77, 96, 79, 81, 84, 79, 86, 80, 76, 75, 69, 61, 63)
df<-data.frame(cbind(x1,x2,y)) #combine vectors into data frame
df$x2<-factor(df$x2) #make x2 a factor

#here is the model
mod<-lm(y~poly(x1,2)*x2,data=df)

I prefer to start from an empty plot and build up from there. I can easily plot the raw data, but I'm not sure how to get the fitted lines and 95% confidence intervals from the model added in here.

#create empty plot
plot(y~x1,xlim=c(min(x1),max(x1)),ylim=c(min(y),max(y)),
 type="n",xlab="x1",ylab="y",data=df)

#add data points
points(y[x2==1]~x1[x2==1], cex=1.7,pch=21,lwd=2,bg="gray10", data=df)
points(y[x2==2]~x1[x2==2], cex=1.7,pch=22,lwd=2,bg="gray40", data=df)
points(y[x2==3]~x1[x2==3], cex=1.7,pch=25,lwd=2,bg="gray80", data=df)

Through a lot of research and tinkering, I figured out how to get the fitted lines and 95% CI's using the 'effects' package, but I don't know how to add the raw data to this plot.

library(effects)
plot(allEffects(mod,xlevels=list(x1=min(x1):max(x1)),xlab="x1",ylab="y"),multiline=T,rug=F,ci.style="bands")

That was a lot of writing and code, but I hope what I'm trying to do is clear. Thanks you very much in advance for your help.


Solution

  • The predict function handles all the messy calculations with the orthogonal polynomials:

    x.two <- df$x2
    lines(x = sort(x.two),
          y = predict(mod, newdata=data.frame(x1=factor("1"), x2=sort(x.two) ) ) ,
          col="red")
     lines(x = sort(x.two),
           y = predict(mod, newdata=data.frame(x1=factor("2"), x2=sort(x.two) ) ) ,
           col="green")
     lines(sort(x.two),
           predict(mod,  newdata=data.frame(x1=factor("3"),x2=sort(x.two) ) ) , col="orange")
    

    enter image description here