
How to force R to give the SSR in a single line for all covariates in an ANOVA table?


Background and What I have Tried

Suppose I have a model for a response with 2 numeric covariates and 1 categorical covariate with 4 factor levels:

set.seed(1)
Y  <- sample(100)

n <- 100
X1 <- sample(n)
X2 <- sample(n)
X3 <- as.factor(rep(c("A", "B", "C", "D"), n/4))

model <- lm(Y ~ X1 + X2 + X3)

If I use the anova function, I get the following result:

> anova(model)
Analysis of Variance Table

Response: Y
          Df Sum Sq Mean Sq F value Pr(>F)
X1         1   1029 1028.85  1.1896 0.2782
X2         1    645  645.41  0.7462 0.3899
X3         3    351  116.87  0.1351 0.9389
Residuals 94  81300  864.89

I can also fit the model with the aov function and pass it to anova, which gives the same table:

model_aov <- aov(Y ~ X1 + X2 + X3)
> anova(model_aov)
Analysis of Variance Table

Response: Y
          Df Sum Sq Mean Sq F value Pr(>F)
X1         1   1029 1028.85  1.1896 0.2782
X2         1    645  645.41  0.7462 0.3899
X3         3    351  116.87  0.1351 0.9389
Residuals 94  81300  864.89 

Desired Result

I would like to aggregate all covariates into a single line that gives the regression sum of squares (SSR).

Is it possible to obtain an ANOVA table which will look like the following:

Response: Y
               Df Sum Sq Mean Sq F value Pr(>F)
X1 + X2 + X3    5   2025     405       f p
Residuals      94  81300  864.89  

where f and p are calculated within R?


Solution

  • This can be done by extracting the design matrix from the linear model and refitting with that matrix as a single term.

    model <- lm(Y ~ X1 + X2 + X3)
    
    design_matrix <- model.matrix(model)
    

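    Because every column of design_matrix enters the model as one term, anova now reports a single line for all covariates. The matrix's own intercept column is redundant with the intercept that lm adds, so it is dropped, which leaves 5 degrees of freedom: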

    > anova(lm(Y ~ design_matrix))
    Analysis of Variance Table
    
    Response: Y
                  Df Sum Sq Mean Sq F value Pr(>F)
    design_matrix  5   2025  404.98  0.4682  0.799
    Residuals     94  81300  864.89
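
As a possible alternative, the pooled line can also be obtained without building the design matrix, by comparing the fitted model against an intercept-only model with anova. The sketch below assumes the same simulated data as in the question; the names full, null, cmp and tab are illustrative and not part of the original answer. The second half cross-checks the result by adding up the covariate rows of the sequential table.

```r
# Same simulated data as in the question.
set.seed(1)
n  <- 100
Y  <- sample(n)
X1 <- sample(n)
X2 <- sample(n)
X3 <- as.factor(rep(c("A", "B", "C", "D"), n / 4))

full <- lm(Y ~ X1 + X2 + X3)
null <- lm(Y ~ 1)  # intercept-only model (illustrative name)

# Row 2 of this comparison holds the pooled Df, Sum of Sq, F and Pr(>F)
# for all covariates taken together.
cmp <- anova(null, full)
print(cmp)

# Cross-check: sum the covariate rows of the sequential ANOVA table.
tab      <- anova(full)
is_resid <- rownames(tab) == "Residuals"
ssr      <- sum(tab[!is_resid, "Sum Sq"])   # regression sum of squares
df_reg   <- sum(tab[!is_resid, "Df"])       # regression degrees of freedom
f_stat   <- (ssr / df_reg) / tab[is_resid, "Mean Sq"]
p_val    <- pf(f_stat, df_reg, tab[is_resid, "Df"], lower.tail = FALSE)
```

The comparison table is laid out differently from the desired one (it has Res.Df and RSS columns), but its second row should carry the same Df, sum of squares, F value and p-value as the design_matrix line above. The same F statistic is also what summary(full) reports as the overall F-test.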