I am currently learning ANCOVA, but I am confused by the results. I created a linear regression model using mtcars like this:
summary(lm(qsec ~ wt+factor(am), data = mtcars))
The output is:
Call:
lm(formula = qsec ~ wt + factor(am), data = mtcars)
Residuals:
    Min      1Q  Median      3Q     Max 
-2.6898 -1.3063  0.0167  1.1398  3.9917 
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  22.5990     1.5596  14.490 8.17e-15 ***
wt           -1.1716     0.4025  -2.911  0.00685 ** 
factor(am)1  -2.4141     0.7892  -3.059  0.00474 ** 
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.582 on 29 degrees of freedom
Multiple R-squared: 0.267, Adjusted R-squared: 0.2165
F-statistic: 5.283 on 2 and 29 DF, p-value: 0.01106
As you can see, the p value of wt is 0.00685, which I take to mean a strong linear relationship between wt and qsec, and likewise for am.
But then I ran the corresponding aov code:
summary(aov(qsec ~ wt+factor(am), data = mtcars))
With the output:
            Df Sum Sq Mean Sq F value  Pr(>F)   
wt           1   3.02   3.022   1.208 0.28081   
factor(am)   1  23.41  23.413   9.358 0.00474 **
Residuals   29  72.55   2.502                   
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
It seems like wt has no effect on qsec.
Does this mean that a strong linear relationship between wt and qsec can be confirmed, yet wt has no great effect on qsec? Is my explanation appropriate?
First, drop the factor: since am only takes the two values 0 and 1, wrapping it in factor() has no effect on the results.
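As a quick check, running the two fits side by side shows identical coefficient tables; only the label changes from factor(am)1 to am:

coef(summary(lm(qsec ~ wt + factor(am), mtcars)))
coef(summary(lm(qsec ~ wt + am, mtcars)))
## same Estimate / Std. Error / t value / Pr(>|t|) rows in both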
Now, regarding the tests used to get the p values: they are different. For lm, the wt p value is based on the comparison of these two models:
qsec ~ am
qsec ~ wt + am
so we have
anova(lm(qsec ~ am, mtcars), lm(qsec ~ wt + am, mtcars))
## Model 1: qsec ~ am
## Model 2: qsec ~ wt + am
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)   
## 1     30 93.758                                
## 2     29 72.554  1    21.204 8.4753 0.006854 **  <-- 0.00685
summary(lm(qsec ~ wt + am, mtcars))
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   22.5990     1.5596  14.490 8.17e-15 ***
## wt            -1.1716     0.4025  -2.911  0.00685 **  <-- 0.00685
## am            -2.4141     0.7892  -3.059  0.00474 ** 
whereas aov is really meant for balanced designs where the terms are orthogonal. If they are not, as here, it conceptually orthogonalizes them sequentially, so the comparison is effectively between these two models:
qsec ~ r.am
qsec ~ r.wt + r.am
where r.wt is the portion of wt orthogonal to the intercept, and r.am is the portion of am orthogonal to wt and the intercept, so we have
r.wt <- resid(lm(wt ~ 1, mtcars))
r.am <- resid(lm(am ~ wt, mtcars))
anova(lm(qsec ~ r.am, mtcars), lm(qsec ~ r.wt + r.am, mtcars))
## Model 1: qsec ~ r.am
## Model 2: qsec ~ r.wt + r.am
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     30 75.576                           
## 2     29 72.554  1    3.0217 1.2078 0.2808  <------- 0.2808
summary(aov(qsec ~ wt + am, mtcars))
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## wt           1   3.02   3.022   1.208 0.28081    <------- 0.28081
## am           1  23.41  23.413   9.358 0.00474 **
## Residuals   29  72.55   2.502                   
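One consequence worth noting: because aov tests terms sequentially, putting wt last in the formula recovers the lm p value for it. This is the same model with the terms reordered, and the numbers follow from the model comparison shown above (Sum of Sq 21.204 against the residual mean square 2.502):

summary(aov(qsec ~ am + wt, mtcars))
## with wt entered last it is tested adjusted for am,
## giving F = 21.204 / 2.502 = 8.475 and p = 0.00685, matching lm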
It would also be possible to demonstrate this by performing Gram-Schmidt on the cbind(1, wt, am) model matrix to make the columns orthogonal. The pracma package has a Gram-Schmidt routine.
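Here is a minimal sketch of that route, assuming the pracma package is installed (its gramSchmidt function returns an orthonormal Q with the original matrix equal to Q %*% R):

library(pracma)
X <- with(mtcars, cbind(1, wt, am))
Q <- gramSchmidt(X)$Q  # col 1: intercept; col 2: wt orthogonal to it; col 3: am orthogonal to both
fm <- lm(qsec ~ Q[, 2] + Q[, 3], data = mtcars)
coef(summary(fm))
## because the columns are orthogonal, the t tests on Q[, 2] and Q[, 3]
## reproduce the sequential aov p values 0.28081 and 0.00474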