
Significant interaction in linear mixed effect model but plot shows overlapping confidence intervals?


I will try to show as much as possible of the structure of the data and of the results I produced.

The structure of the data is the following:

   GroupID Person Factor2 Factor1 Rating
    <int>  <int>  <fctr>  <fctr>  <int>
1       2    109       2       0      1
2       2    109       2       1     -2
3       2    104       1       0      4
4       2    236       1       1      1
5       2    279       1       1      2
6       2    179       2       1      0

Person is the participant ID, GroupID identifies the kind of stimulus rated, Factor1 (levels 0 and 1) and Factor2 (levels 1 and 2) are fixed factors, and Rating is the outcome variable.
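A note on coding: Person and GroupID are stored as integers. lmer() coerces random-effect grouping variables to factors automatically, so the models below are unaffected, but I recode them explicitly to keep later summaries unambiguous. A minimal sketch, assuming the data frame is named data as in the model calls below:

# Optional: store the integer ID columns as factors before modelling;
# lmer() would coerce them for the random effects anyway
data$Person  <- factor(data$Person)
data$GroupID <- factor(data$GroupID)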

I am trying to plot a significant interaction from a linear mixed effects model. I used the packages lme4 and lmerTest to analyze the data.

This is the model we ran:

> model_interaction <- lmer(Rating ~ Factor1 * Factor2 + (1 | Person) +
  (1 | GroupID), data)
> model_interaction

Linear mixed model fit by REML ['merModLmerTest']
Formula: Rating ~ Factor1 * Factor2 + (1 | Person) + (1 | GroupID)
Data: data
REML criterion at convergence: 207223.9

Random effects:
 Groups   Name        Std.Dev.
 Person   (Intercept) 1.036   
 GroupID  (Intercept) 1.786   
 Residual             1.880   
Number of obs: 50240, groups:  Person, 157; GroupID, 80
Fixed Effects:
  (Intercept)           Factor11           Factor22  Factor11:Factor22  
     -0.43823            0.01313            0.08568            0.12440  

When I use the summary() function, R returns the following output:

> summary(model_interaction)

Linear mixed model fit by REML 
t-tests use  Satterthwaite approximations to degrees of freedom
['lmerMod']
Formula: Rating ~ Factor1 * Factor2 + (1 | Person) + (1 | GroupID)
   Data: data

REML criterion at convergence: 207223.9

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.8476 -0.6546 -0.0213  0.6516  4.2284 

Random effects:
 Groups   Name        Variance Std.Dev.
 Person   (Intercept) 1.074    1.036   
 GroupID  (Intercept) 3.191    1.786   
 Residual             3.533    1.880   
Number of obs: 50240, groups:  Person, 157; GroupID, 80

Fixed effects:
                    Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)       -4.382e-01  2.185e-01  1.110e+02  -2.006 0.047336 *  
Factor11           1.313e-02  2.332e-02  5.004e+04   0.563 0.573419    
Factor22           8.568e-02  6.275e-02  9.793e+03   1.365 0.172138    
Factor11:Factor22  1.244e-01  3.385e-02  5.002e+04   3.675 0.000238 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) Fctr11 Fctr22
Factor11    -0.047              
Factor22    -0.135  0.141       
Fctr11:Fc22  0.034 -0.694 -0.249

I know that p-values for linear mixed effects models are hard to interpret. So I additionally ran a likelihood-ratio test via anova(), comparing the interaction model to a model with just the main effects of Factor1 and Factor2:

> model_Factor1_Factor2 = lmer(Rating ~ Factor1 + Factor2 +
  (1 | Person) + (1 | GroupID), data)
> anova(model_Factor1_Factor2, model_interaction)

Data: data
Models:
object: Rating ~ Factor1 + Factor2 + (1 | Person) + (1 | GroupID)
..1: Rating ~ Factor1 * Factor2 + (1 | Person) + (1 | GroupID)
       Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
object  6 207233 207286 -103611   207221                             
..1     7 207222 207283 -103604   207208 13.502      1  0.0002384 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

I interpreted this output as follows: the interaction of Factor1 and Factor2 explains additional variance in the outcome compared to the model with only the main effects of Factor1 and Factor2. (Note that anova() refits both models with maximum likelihood before comparing them, which is why the deviances above differ from the REML criteria.)

Since interpreting output from linear mixed effects models is hard, I would like to plot the interaction of Factor1 and Factor2. I did so using the lsmeans package. (At first I used plot(allEffects(...)), but after reading the question "How to get coefficients and their confidence intervals in mixed effects models?" I realized that this was not the right way to plot linear mixed effects models.)
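On that note, confidence intervals for the model coefficients themselves can be obtained directly from the fitted model with confint(). A minimal sketch, using Wald intervals since the default profile method is slow with 50,240 observations:

# Wald confidence intervals for the fixed effects only ("beta_")
confint(model_interaction, parm = "beta_", method = "Wald")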

So this is what I did (following this website: http://rcompanion.org/handbook/G_06.html):

> leastsquare = lsmeans(model_interaction, pairwise ~ Factor2:Factor1,
 adjust="bon")
> CLD = cld(leastsquare, alpha=0.05, Letters=letters, adjust="bon")
> CLD$.group=gsub(" ", "", CLD$.group)
> CLD
 Factor2 Factor1     lsmean        SE     df   lower.CL  upper.CL .group
 1       0       -0.4382331 0.2185106 111.05 -0.9930408 0.1165746  a    
 1       1       -0.4251015 0.2186664 111.36 -0.9803048 0.1301018  a    
 2       0       -0.3525561 0.2190264 112.09 -0.9086735 0.2035612  a    
 2       1       -0.2150234 0.2189592 111.95 -0.7709700 0.3409233   b   

Degrees-of-freedom method: satterthwaite 
Confidence level used: 0.95 
Conf-level adjustment: bonferroni method for 4 estimates 
P value adjustment: bonferroni method for 6 tests 
significance level used: alpha = 0.05 

This is the plotting function I used:

> ggplot(CLD, aes(x = Factor1, y = lsmean, ymax = upper.CL,
  ymin = lower.CL, colour = Factor2, group = Factor2)) +
  geom_pointrange(position = position_dodge(width = 0.1)) +
  geom_line(position = position_dodge(width = 0.1))

The plot can be found at the following link (I am not allowed to post images yet, please excuse the workaround):

Interaction of Factor1 and Factor2
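As an aside, the lsmeans package also provides lsmip() for quick interaction plots. A one-line sketch; note that it draws only the least-squares means, without confidence bars:

# Built-in interaction plot of the least-squares means (no confidence bars)
lsmip(model_interaction, Factor2 ~ Factor1)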

Now my question is the following: Why do I have a significant interaction, and a significant amount of variance explained by that interaction, while the confidence intervals in the plot overlap? Did I do something wrong with the confidence intervals? Or is it just not possible to interpret significance for linear mixed effects models this way?


Solution

  • Because it’s apples and oranges.

    Apples: confidence intervals for means.

    Oranges: tests of differences of means.

    Means, and differences of means, are different statistics, and they have different standard errors and other distributional properties. In mixed models especially, they can differ radically, because some sources of variation cancel out when you take differences. Your own output shows this: each cell mean has a standard error of about 0.22, inflated by the Person and GroupID random effects, while the interaction contrast has a standard error of only about 0.034, because those random intercepts cancel out of comparisons made within the same persons and groups.

    Don’t try to use confidence intervals to do comparisons. It’s like trying to make chicken soup out of hamburger. If you want interval-based comparisons, compute intervals for the differences themselves, as sketched below.
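    A minimal sketch, reusing the leastsquare object computed in the question: confint() on the contrasts component of the lsmeans call returns intervals for the pairwise differences, which are the statistics the significance tests are actually about.

    # Intervals for differences of means -- these are what the tests address
    confint(leastsquare$contrasts, adjust = "bon")

    # Or tests and intervals together
    summary(leastsquare$contrasts, infer = c(TRUE, TRUE), adjust = "bon")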