
Why does changing contrast type change row labels in R lm summary?


With the default contrasts in R (contr.treatment), the summary of a linear model object labels the coefficient rows with the factor's level names. When I change the contrasts to contr.sum, the summary labels the rows with made-up numbers instead.

For the example code below, the row names under treatment contrasts are xb, xc, xd, xe (level a is the baseline); under sum contrasts they are x1, x2, x3, x4.

Is there a way to make these behave the same way besides manually renaming the rows?

EXAMPLE:

y <- rnorm(10, 0, 1)
x <- factor(rep(letters[1:5], each = 2))

options(contrasts = c("contr.treatment", "contr.poly"))
summary(lm(y ~ x))

options(contrasts = c("contr.sum", "contr.poly"))
summary(lm(y ~ x))

Solution

  • I'm still not at all sure this is a good idea; I think the risk of getting confused about what the contrasts mean is too high. Still, what I would do is write a new contrasts function that computes sum contrasts but sets the column names equal to the default names from the treatment contrasts.

    set.seed(5)
    n <- 5
    y <- c(10 + rnorm(n, 0, 1), 20 + rnorm(n, 0, 1), 30 + rnorm(n, 0, 1))
    wFactor <- as.factor(c(rep("A", n), rep("B", n), rep("C", n)))
    
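    contr.sumX <- function(...) {
      conT <- contr.treatment(...)  # columns named after every level except the first
      conS <- contr.sum(...)        # sum contrasts; columns come back unnamed
      # Copy the labels only. The columns still encode sum contrasts for the
      # first k-1 levels, so each label is shifted by one level.
      colnames(conS) <- colnames(conT)
      conS
    }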
    

    For reference, here's the usual output:

    > m1 <- lm(y ~ wFactor, contrasts = list(wFactor=contr.sum(n = levels(wFactor))))
    > coef(summary(m1))
                  Estimate Std. Error     t value     Pr(>|t|)
    (Intercept) 19.8218432  0.2481727  79.8711599 9.889455e-18
    wFactor1    -9.6079241  0.3509692 -27.3754029 3.480430e-12
    wFactor2    -0.1934654  0.3509692  -0.5512319 5.915907e-01
    

    And here's the output with the contr.sumX function.

    > m2 <- lm(y ~ wFactor, contrasts = list(wFactor=contr.sumX(n = levels(wFactor))))
    > coef(summary(m2))
                  Estimate Std. Error     t value     Pr(>|t|)
    (Intercept) 19.8218432  0.2481727  79.8711599 9.889455e-18
    wFactorB    -9.6079241  0.3509692 -27.3754029 3.480430e-12
    wFactorC    -0.1934654  0.3509692  -0.5512319 5.915907e-01
    

    Alternatively, you can set the contrasts for a particular factor ahead of time by giving the function's name:

    > contrasts(wFactor) <- "contr.sumX"
    > m3 <- lm(y ~ wFactor)
    > coef(summary(m3))
                  Estimate Std. Error     t value     Pr(>|t|)
    (Intercept) 19.8218432  0.2481727  79.8711599 9.889455e-18
    wFactorB    -9.6079241  0.3509692 -27.3754029 3.480430e-12
    wFactorC    -0.1934654  0.3509692  -0.5512319 5.915907e-01
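
As for why the labels differ in the first place: lm() takes the coefficient names from the column names of each factor's contrast matrix. contr.treatment keeps the level names on its columns, while contr.sum returns unnamed columns, so the rows fall back to 1, 2, and so on. Note also that sum contrasts drop the last level rather than the first. In the output above, the row labelled wFactorB (about -9.6) is really level A's deviation from the mean of the level means. The sketch below is one possible variant that names the columns after the levels they describe; contr.sumNamed is a hypothetical helper, not part of base R, and it assumes the level names are passed in as a character vector.

```r
# Where the labels come from: column names of the contrast matrices.
lv <- letters[1:5]
colnames(contr.treatment(lv))  # level names, minus the baseline level
colnames(contr.sum(lv))        # NULL, so lm() numbers the rows instead

# Hypothetical helper (an assumption, not base R): sum contrasts whose
# columns are named after the levels they describe (all but the last).
contr.sumNamed <- function(n, ...) {
  con <- contr.sum(n, ...)
  # Only relabel when level names were supplied and one column was dropped
  if (is.character(n) && ncol(con) == length(n) - 1L) {
    colnames(con) <- n[-length(n)]
  }
  con
}

# Same simulated data as in the answer above
set.seed(5)
n <- 5
y <- c(10 + rnorm(n, 0, 1), 20 + rnorm(n, 0, 1), 30 + rnorm(n, 0, 1))
wFactor <- factor(rep(c("A", "B", "C"), each = n))

m4 <- lm(y ~ wFactor,
         contrasts = list(wFactor = contr.sumNamed(levels(wFactor))))
names(coef(m4))  # rows are now labelled by levels A and B

# Check the interpretation: A's coefficient is A's mean minus the
# average of the three level means.
cell_means <- tapply(y, wFactor, mean)
as.numeric(cell_means["A"] - mean(cell_means))
as.numeric(coef(m4)["wFactorA"])
```

With this naming the labels no longer match the treatment-contrast labels (A and B rather than B and C), which is the price of having each label point at the level it actually describes.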