Search code examples
rlme4contrast

lme4 deviant/tratment contrast coding with interactions in R - levels are missing


I have a mixed effects model (with lme4) with a 2-way interaction term, each term having multiple levels (each 4) and I would like to investigate their effects in reference to their grand mean. I present this example here from the car data set and omit the error term since it is not neccessary for this example:

## shorten data frame for simplicity
df=Cars93[c(1:15),]
df=Cars93[is.element(Cars93$Make,c('Acura Integra', 'Audi 90','BMW 535i','Subaru Legacy')),]
df$Make=drop.levels(df$Make)
df$Model=drop.levels(df$Model)

## define contrasts (every factor has 4 levels)
contrasts(df$Make) = contr.treatment(4)
contrasts(df$Model) = contr.treatment(4)

## model
m1 <- lm(Price ~ Model*Make,data=df)
summary(m1)

as you can see, the first levels are omitted in the interaction term. And I would like to have all 4 levels in the output, referenced to the grand mean (often referred to deviant coding). These are the sources I looked at: https://marissabarlaz.github.io/portfolio/contrastcoding/#coding-schemes and How to change contrasts to compare with mean of all levels rather than reference level (R, lmer)?. The last reference does not report interactions though.


Solution

  • The simple answer is that what you want is not possible directly. You have to use a slightly different approach.

    In a model with interactions, you want to use contrasts in which the mean is zero and not a specific level. Otherwise, the lower-order effects (i.e., main effects) are not main effects but simple effects (evaluated when the other factor level is at its reference level). This is explained in more details in my chapter on mixed models: http://singmann.org/download/publications/singmann_kellen-introduction-mixed-models.pdf

    To get what you want, you have to fit the model in a reasonable manner and then pass it to emmeans to compare against the intercept (i.e., the unweighted grand mean). This works also for interactions as shown below (as your code did not work, I use warpbreaks).

    afex::set_sum_contrasts() ## uses contr.sum globally
    library("emmeans")
    
    ## model
    m1 <- lm(breaks ~ wool * tension,data=warpbreaks)
    car::Anova(m1, type = 3)
    
    coef(m1)[1]
    # (Intercept) 
    #    28.14815 
    
    ## both CIs include grand mean:
    emmeans(m1, "wool") 
    #  wool emmean   SE df lower.CL upper.CL
    #  A      31.0 2.11 48     26.8     35.3
    #  B      25.3 2.11 48     21.0     29.5
    # 
    # Results are averaged over the levels of: tension 
    # Confidence level used: 0.95 
    
    ## same using test
    emmeans(m1, "wool", null = coef(m1)[1], infer = TRUE) 
    #   wool emmean   SE df lower.CL upper.CL null t.ratio p.value
    #  A      31.0 2.11 48     26.8     35.3 28.1  1.372  0.1764 
    #  B      25.3 2.11 48     21.0     29.5 28.1 -1.372  0.1764 
    # 
    # Results are averaged over the levels of: tension 
    # Confidence level used: 0.95 
    
    emmeans(m1, "tension", null = coef(m1)[1], infer = TRUE) 
    #  tension emmean   SE df lower.CL upper.CL null t.ratio p.value
    #  L         36.4 2.58 48     31.2     41.6 28.1  3.196  0.0025 
    #  M         26.4 2.58 48     21.2     31.6 28.1 -0.682  0.4984 
    #  H         21.7 2.58 48     16.5     26.9 28.1 -2.514  0.0154 
    # 
    # Results are averaged over the levels of: wool 
    # Confidence level used: 0.95 
    
    emmeans(m1, c("tension", "wool"), null = coef(m1)[1], infer = TRUE) 
    #  tension wool emmean   SE df lower.CL upper.CL null t.ratio p.value
    #  L       A      44.6 3.65 48     37.2     51.9 28.1  4.499  <.0001 
    #  M       A      24.0 3.65 48     16.7     31.3 28.1 -1.137  0.2610 
    #  H       A      24.6 3.65 48     17.2     31.9 28.1 -0.985  0.3295 
    #  L       B      28.2 3.65 48     20.9     35.6 28.1  0.020  0.9839 
    #  M       B      28.8 3.65 48     21.4     36.1 28.1  0.173  0.8636 
    #  H       B      18.8 3.65 48     11.4     26.1 28.1 -2.570  0.0133 
    # 
    # Confidence level used: 0.95 
    

    Note that for coef() you probably want to use fixef() for lme4 models.