I'm using the R package MuMIn
to do multimodel inference and the function model.avg
to average the coefficients estimated by a set of models. To visually compare the data to the estimated relationships based on the averaged coefficients, I want to use partial residual plots, similar to the ones created by the crPlots
function of the car
package. I've tried three ways and I'm not sure whether any is appropriate. Here is a demonstration.
library(MuMIn)
# Loading the data
data(Cement)
# Creating a full model with all the covariates we are interested in
fullModel <- lm(y ~ ., data = Cement, na.action=na.fail)
# Getting all possible models based on the covariates of the full model
muModel <- dredge(fullModel)
# Averaging across all models
avgModel <- model.avg(muModel)
# Getting the averaged coefficients
coefMod <- coef(avgModel)
coefMod
# (Intercept) X1 X2 X4 X3
# 65.71487660 1.45607957 0.61085531 -0.49776089 -0.07148454
Option 1: Using crPlots
library(car) # For crPlots
# Creating a duplicate of the fullMode
hackModel <- fullModel
# Changing the coefficents to the averaged coefficients
hackModel$coefficients <- coefMod[names(coef(fullModel))]
# Changing the residuals
hackModel$residuals <- Cement$y - predict(hackModel)
# Plot the hacked model vs the full model
layout(matrix(1:8, nrow=2, byrow=TRUE))
crPlots(hackModel, layout=NA)
crPlots(fullModel, layout=NA)
Notice that the crPlots of the full and hacked versions with the average coefficients are different.
The question here is: Is this appropriate? The results rely on a hack that I found in this answer. Do I need to change parts of the model other than the residuals and the coefficients?
Option 2: Homemade plots
# Partial residuals: residuals(hacked model) + beta*x
# X1
# Get partial residuals
prX1 <- resid(hackModel) + coefMod["X1"]*Cement$X1
# Plot the partial residuals
plot(prX1 ~ Cement$X1)
# Add modeled relationship
abline(a=0,b=coefMod["X1"])
# X2 - X4
plot(resid(hackModel) + coefMod["X2"]*X2 ~ X2, data=Cement); abline(a=0,b=coefMod["X2"])
plot(resid(hackModel) + coefMod["X3"]*X3 ~ X3, data=Cement); abline(a=0,b=coefMod["X3"])
plot(resid(hackModel) + coefMod["X4"]*X4 ~ X4, data=Cement); abline(a=0,b=coefMod["X4"])
The plot looks different from the ones produced by the crPlots
above.
The partial residuals have similar patterns, but their values and modeled relationships are different. The difference in values appears to due to the fact that crPlots used centered partial residuals (see this answer for a discussion of partial residuals in R). This brings me to my third option.
Option 3: Homemade plots with centered partial residuals
# Get the centered partial residuals
pRes <- resid(hackModel, type='partial')
# X1
# Plot the partial residuals
plot(pRes[,"X1"] ~ Cement$X1)
# Plot the component - modeled relationship
lines(coefMod["X1"]*(X1-mean(X1))~X1, data=Cement)
# X2 - X4
plot(pRes[,"X2"] ~ Cement$X2); lines(coefMod["X2"]*(X2-mean(X2))~X2, data=Cement)
plot(pRes[,"X3"] ~ Cement$X3); lines(coefMod["X3"]*(X3-mean(X3))~X3, data=Cement)
plot(pRes[,"X4"] ~ Cement$X4); lines(coefMod["X4"]*(X4-mean(X4))~X4, data=Cement)
Now we have similar values than the crPlots
above, but the relationships are still different. The difference may be related to the intercepts. But I'm not sure what I should use instead of 0.
Any suggestions of which method is more appropriate? Is there a more straightforward way to get the partial residual plots based on model averaged coefficients?
Many thanks!
From looking at the crPlot.lm
source code, it looks like only the functions residuals(model, type="partial")
, predict(model, type="terms", term=var)
, and functions associated with finding the names of the variables are used on the model object. It also looks like the relationship is regressed, as @BenBolker suggested. The code used in crPlot.lm
is: abline(lm(partial.res[,var]~.x), lty=2, lwd=lwd, col=col.lines[1])
. Thus, I think that changing the coefficients and residuals of the model is sufficient to be able to use crPlots
on it. I can now also reproduce the results in a homemade way.
library(MuMIn)
# Loading the data
data(Cement)
# Creating a full model with all the covariates we are interested in
fullModel <- lm(y ~ ., data = Cement, na.action=na.fail)
# Getting all possible models based on the covariates of the full model
muModel <- dredge(fullModel)
# Averaging across all models
avgModel <- model.avg(muModel)
# Getting the averaged coefficients
coefMod <- coef(avgModel)
# Option 1 - crPlots
library(car) # For crPlots
# Creating a duplicate of the fullMode
hackModel <- fullModel
# Changing the coefficents to the averaged coefficient
hackModel$coefficients <- coefMod[names(coef(fullModel))]
# Changing the residuals
hackModel$residuals <- Cement$y - predict(hackModel)
# Plot the crPlots and the regressed homemade version
layout(matrix(1:8, nrow=2, byrow=TRUE))
par(mar=c(3.5,3.5,0.5,0.5), mgp=c(2,1,0))
crPlots(hackModel, layout=NA, ylab="Partial Res", smooth=FALSE)
# Option 4 - Homemade centered and regressed
# Get the centered partial residuals
pRes <- resid(hackModel, type='partial')
# X1 - X4 plot partial residuals and used lm for the relationship
plot(pRes[,"X1"] ~ Cement$X1); abline(lm(pRes[,"X1"]~Cement$X1))
plot(pRes[,"X2"] ~ Cement$X2); abline(lm(pRes[,"X2"]~Cement$X2))
plot(pRes[,"X3"] ~ Cement$X3); abline(lm(pRes[,"X3"]~Cement$X3))
plot(pRes[,"X4"] ~ Cement$X4); abline(lm(pRes[,"X4"]~Cement$X4))