I am doing a submodel test. The smaller model is nested in the bigger model, which contains one additional continuous variable. I compare them with a likelihood ratio test, but the result is quite strange: both models have the same statistics, such as residual deviance and degrees of freedom, and I also find that the two models have the same estimated coefficients and standard errors. How is this possible?
summary(m2221)
Call:
glm(formula = clm ~ veh_age + veh_body + agecat + veh_value:veh_age +
veh_value:area, family = "binomial", data = Car)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.9245 -0.3939 -0.3683 -0.3437 2.7095
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.294118 0.382755 -3.381 0.000722 ***
veh_age2 0.051790 0.098463 0.526 0.598897
veh_age3 -0.166801 0.094789 -1.760 0.078457 .
veh_age4 -0.239862 0.096154 -2.495 0.012611 *
veh_bodyCONVT -2.184124 0.707884 -3.085 0.002033 **
veh_bodyCOUPE -0.850675 0.393625 -2.161 0.030685 *
veh_bodyHBACK -1.105087 0.374134 -2.954 0.003140 **
veh_bodyHDTOP -0.973472 0.383404 -2.539 0.011116 *
veh_bodyMCARA -0.649036 0.469407 -1.383 0.166765
veh_bodyMIBUS -1.295135 0.404691 -3.200 0.001373 **
veh_bodyPANVN -0.903032 0.395295 -2.284 0.022345 *
veh_bodyRDSTR -1.108488 0.826541 -1.341 0.179883
veh_bodySEDAN -1.097931 0.373578 -2.939 0.003293 **
veh_bodySTNWG -1.129122 0.373713 -3.021 0.002516 **
veh_bodyTRUCK -1.156099 0.384088 -3.010 0.002613 **
veh_bodyUTE -1.343958 0.377653 -3.559 0.000373 ***
agecat2 -0.198002 0.058382 -3.391 0.000695 ***
agecat3 -0.224492 0.056905 -3.945 7.98e-05 ***
agecat4 -0.253377 0.056774 -4.463 8.09e-06 ***
agecat5 -0.441906 0.063227 -6.989 2.76e-12 ***
agecat6 -0.447231 0.072292 -6.186 6.15e-10 ***
veh_age1:veh_value -0.000637 0.026387 -0.024 0.980740
veh_age2:veh_value 0.035386 0.031465 1.125 0.260753
veh_age3:veh_value 0.114485 0.036690 3.120 0.001806 **
veh_age4:veh_value 0.189866 0.057573 3.298 0.000974 ***
veh_value:areaB 0.044099 0.021550 2.046 0.040722 *
veh_value:areaC 0.021892 0.019189 1.141 0.253931
veh_value:areaD -0.023616 0.024939 -0.947 0.343658
veh_value:areaE -0.013506 0.026886 -0.502 0.615415
veh_value:areaF 0.057780 0.026602 2.172 0.029850 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 33767 on 67855 degrees of freedom
Residual deviance: 33592 on 67826 degrees of freedom
AIC: 33652
Number of Fisher Scoring iterations: 5
summary(m222)
Call:
glm(formula = clm ~ veh_value + veh_age + veh_body + agecat +
veh_value:veh_age + veh_value:area, family = "binomial",
data = Car)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.9245 -0.3939 -0.3683 -0.3437 2.7095
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.294118 0.382755 -3.381 0.000722 ***
veh_value -0.000637 0.026387 -0.024 0.980740
veh_age2 0.051790 0.098463 0.526 0.598897
veh_age3 -0.166801 0.094789 -1.760 0.078457 .
veh_age4 -0.239862 0.096154 -2.495 0.012611 *
veh_bodyCONVT -2.184124 0.707884 -3.085 0.002033 **
veh_bodyCOUPE -0.850675 0.393625 -2.161 0.030685 *
veh_bodyHBACK -1.105087 0.374134 -2.954 0.003140 **
veh_bodyHDTOP -0.973472 0.383404 -2.539 0.011116 *
veh_bodyMCARA -0.649036 0.469407 -1.383 0.166765
veh_bodyMIBUS -1.295135 0.404691 -3.200 0.001373 **
veh_bodyPANVN -0.903032 0.395295 -2.284 0.022345 *
veh_bodyRDSTR -1.108488 0.826541 -1.341 0.179883
veh_bodySEDAN -1.097931 0.373578 -2.939 0.003293 **
veh_bodySTNWG -1.129122 0.373713 -3.021 0.002516 **
veh_bodyTRUCK -1.156099 0.384088 -3.010 0.002613 **
veh_bodyUTE -1.343958 0.377653 -3.559 0.000373 ***
agecat2 -0.198002 0.058382 -3.391 0.000695 ***
agecat3 -0.224492 0.056905 -3.945 7.98e-05 ***
agecat4 -0.253377 0.056774 -4.463 8.09e-06 ***
agecat5 -0.441906 0.063227 -6.989 2.76e-12 ***
agecat6 -0.447231 0.072292 -6.186 6.15e-10 ***
veh_value:veh_age2 0.036023 0.034997 1.029 0.303331
veh_value:veh_age3 0.115122 0.039476 2.916 0.003543 **
veh_value:veh_age4 0.190503 0.058691 3.246 0.001171 **
veh_value:areaB 0.044099 0.021550 2.046 0.040722 *
veh_value:areaC 0.021892 0.019189 1.141 0.253931
veh_value:areaD -0.023616 0.024939 -0.947 0.343658
veh_value:areaE -0.013506 0.026886 -0.502 0.615415
veh_value:areaF 0.057780 0.026602 2.172 0.029850 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 33767 on 67855 degrees of freedom
Residual deviance: 33592 on 67826 degrees of freedom
AIC: 33652
anova(m2221, m222, test = "LRT")
Analysis of Deviance Table
Model 1: clm ~ veh_age + veh_body + agecat + veh_value:veh_age +
veh_value:area
Model 2: clm ~ veh_value + veh_age + veh_body + agecat + veh_value:veh_age +
veh_value:area
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1 67826 33592
2 67826 33592 0 0
You have specified the same model two different ways. To show this, I'll first explain what is going on under the hood a bit, then I'll walk through the coefficients of your two models to show they are the same, and I'll end with some higher level intuition and explanation.
First, the only difference between your two models is that the second model includes veh_value as a main-effect predictor, whereas the first does not. However, veh_value is interacted with other predictors in both models.
So let's consider a simple reproducible example and see what R does when we do this. I'll use model.matrix() to feed in two different formulas and view the continuous and dummy variables that R creates as a result.
colnames(model.matrix(am ~ mpg + factor(cyl):mpg, mtcars))
#[1] "(Intercept)" "mpg" "mpg:factor(cyl)6" "mpg:factor(cyl)8"
colnames(model.matrix(am ~ factor(cyl):mpg, mtcars))
#"(Intercept)" "factor(cyl)4:mpg" "factor(cyl)6:mpg" "factor(cyl)8:mpg"
Notice that in the first call I included the continuous predictor mpg, whereas I did not include it in the second call (a simpler example of what you are doing). Now note that the second model matrix contains an extra interaction variable (factor(cyl)4:mpg) that the first does not. In other words, because we did not include mpg in the model directly, all levels of cyl get included in the interaction.
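We can take this one step further and actually fit the two formulas as binomial GLMs to confirm that they give identical fits, just as your two models do. This is a minimal sketch using the same mtcars toy example (the object names f1 and f2 are just my labels):
f1 <- glm(am ~ mpg + factor(cyl):mpg, family = "binomial", data = mtcars)
f2 <- glm(am ~ factor(cyl):mpg, family = "binomial", data = mtcars)
all.equal(deviance(f1), deviance(f2)) # deviances should agree up to numerical tolerance
anova(f1, f2, test = "LRT")           # Df = 0 and essentially zero change in deviance, like your output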
Your models are doing essentially the same thing as the simple example above, and we can show that, at the end of the day, the coefficients are actually the same once they are added together.
In your first model, all 4 levels of veh_age appear in the interaction with veh_value, because veh_value itself is not included in the model:
veh_age1:veh_value -0.000637 0.026387 -0.024 0.980740
veh_age2:veh_value 0.035386 0.031465 1.125 0.260753
veh_age3:veh_value 0.114485 0.036690 3.120 0.001806 **
veh_age4:veh_value 0.189866 0.057573 3.298 0.000974 ***
In your second model, only 3 levels of veh_age appear in the interaction with veh_value, because veh_value is included in the model:
veh_value:veh_age2 0.036023 0.034997 1.029 0.303331
veh_value:veh_age3 0.115122 0.039476 2.916 0.003543 **
veh_value:veh_age4 0.190503 0.058691 3.246 0.001171 **
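As a quick sanity check that the two fits really are identical, you can also compare them directly from the fitted objects. This is a small sketch that assumes your fitted objects m2221 and m222 from above:
all.equal(fitted(m2221), fitted(m222))     # should be TRUE: identical fitted probabilities
all.equal(deviance(m2221), deviance(m222)) # should be TRUE: identical residual deviance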
Now, here is the critical piece to see that the models are actually the same. It's easiest to show by just walking through all of the levels of veh_age.
First consider veh_age = 1. For both models, the coefficient on veh_value when veh_age = 1 is -0.000637:
# For first model
veh_age1:veh_value -0.000637 0.026387 -0.024 0.980740
# For second model
veh_value -0.000637 0.026387 -0.024 0.980740
Now consider veh_age = 2. For both models, the coefficient on veh_value when veh_age = 2 is 0.035386:
# For first model
veh_age2:veh_value 0.035386 0.031465 1.125 0.260753
# For second model - note the sum of the two below is 0.035386
veh_value -0.000637 0.026387 -0.024 0.980740
veh_value:veh_age2 0.036023 0.034997 1.029 0.303331
The same holds for the remaining levels: for veh_age = 3, -0.000637 + 0.115122 = 0.114485, and for veh_age = 4, -0.000637 + 0.190503 = 0.189866, matching the veh_age3:veh_value and veh_age4:veh_value coefficients from your first model.
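If you would rather do this check programmatically than read the summaries by eye, the same comparison can be made with coef(). A minimal sketch, assuming your fitted objects m2221 and m222 from above (coefficient names taken from the summaries):
# slope of veh_value at veh_age = 2 in the first parameterisation
coef(m2221)["veh_age2:veh_value"]                          # 0.035386
# same slope in the second parameterisation: main effect plus interaction contrast
coef(m222)["veh_value"] + coef(m222)["veh_value:veh_age2"] # -0.000637 + 0.036023 = 0.035386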
When you include the interaction veh_value:veh_age, you are essentially saying that you want the coefficient of veh_value, a continuous variable, to be conditioned on veh_age, a categorical variable. Including both the interaction veh_value:veh_age and veh_value as predictors is saying the same thing: you want to know the coefficient of veh_value, but you want to condition it on the value of veh_age.
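To make that concrete with the runnable mtcars toy example from earlier (again, f1 and f2 are just my labels), the per-level slopes recovered from the two parameterisations agree, exactly as they did for your two models:
f1 <- glm(am ~ mpg + factor(cyl):mpg, family = "binomial", data = mtcars)
f2 <- glm(am ~ factor(cyl):mpg, family = "binomial", data = mtcars)
# slope of mpg when cyl == 4
coef(f2)["factor(cyl)4:mpg"]                   # should equal coef(f1)["mpg"]
# slope of mpg when cyl == 6: main effect plus contrast in the first parameterisation
coef(f1)["mpg"] + coef(f1)["mpg:factor(cyl)6"] # should equal coef(f2)["factor(cyl)6:mpg"]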