
Why are the standard errors of these emmeans contrasts 100x lower than those of the emmeans themselves?


I'm getting results from a glm model that I can't understand. In this model, Bond is a factor with three levels. The emmeans of the three levels have large standard errors and wide confidence intervals. There doesn't seem to be any significant distinction between the levels from this perspective.

However, when I look at some pairwise comparisons, the standard errors are 100x smaller, and two comparisons (A10 - A15 and A14 - A15) are actually significant at the 95% confidence level. Is there any property of the data or the model that can explain this?

Apologies for my lack of statistical knowledge and inability to produce a reproducible example, as the data set has over 1,000 rows. Many thanks in advance.

> emmeans(Model1, "Bond")

 Bonder emmean   SE  df asymp.LCL asymp.UCL
 A10     -6.75 21.6 Inf     -49.1      35.6
 A14     -6.48 21.6 Inf     -48.8      35.9
 A15     -6.12 21.6 Inf     -48.5      36.2

Results are averaged over the levels of: Spin, Type 
Results are given on the logit (not the response) scale. 
Confidence level used: 0.95 

> pairs(emmeans(Model1, "Bond"))

 contrast  estimate     SE  df z.ratio p.value
 A10 - A14   -0.271 0.2579 Inf -1.049  0.5456 
 A10 - A15   -0.636 0.2591 Inf -2.453  0.0376 
 A14 - A15   -0.365 0.0674 Inf -5.416  <.0001 

Results are averaged over the levels of: Spin, Type 
Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 3 estimates

Solution

  • Hard to know for sure, but the most obvious reason would be a strong positive correlation between the estimates. In general, the variance of (A-B) is Var(A)+Var(B)-2*Cov(A,B). For example, if A, B, and C all had variances (and therefore std errors) of 1 and correlations (== covariances in this case) of 0.8 between all pairs, then the variance of any pairwise difference would be 1+1-2*0.8=0.4, so its std error would be about 0.63. However, reducing the standard errors by two orders of magnitude would require a very strong correlation.

    A related possibility is that this is a case of complete separation, which would lead to large coefficients and even larger standard errors, although I would typically expect these to be even larger than what you show (e.g. |beta|>8, SD>5*beta).

    What is cov2cor(vcov(Model1))?
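
The variance argument above can be checked numerically with a minimal base-R sketch. The covariance matrix `V` below is hypothetical: it assumes all three means share the displayed SE of 21.6 and a single common correlation `rho` (0.9999 is an illustrative value, not something taken from the asker's model). The last step inverts the formula to ask how strong the correlation would have to be to reproduce the reported SE of 0.2579 for A10 - A14 under that equal-SE assumption.

```r
# Hypothetical covariance matrix for the three estimated means.
# Assumptions (not from the asker's model): equal SEs and one common correlation.
se  <- 21.6
rho <- 0.9999
V <- se^2 * (rho + (1 - rho) * diag(3))   # se^2 on the diagonal, se^2 * rho elsewhere
dimnames(V) <- list(c("A10", "A14", "A15"), c("A10", "A14", "A15"))

# Contrast vector for A10 - A14
k <- c(1, -1, 0)

# SE of a linear combination k'b is sqrt(k' V k)
se_diff <- sqrt(drop(t(k) %*% V %*% k))

# The same quantity from Var(A) + Var(B) - 2*Cov(A,B)
se_diff2 <- sqrt(V["A10", "A10"] + V["A14", "A14"] - 2 * V["A10", "A14"])

# With equal SEs, SE_diff = se * sqrt(2 * (1 - rho)); solving for rho gives the
# correlation needed to reproduce the reported contrast SE of 0.2579.
rho_needed <- 1 - (0.2579 / se)^2 / 2

print(cov2cor(V))                 # correlation matrix implied by V
print(c(se_diff = se_diff, se_diff2 = se_diff2, rho_needed = rho_needed))
```

Under these assumptions the required correlation comes out above 0.9999, which is what "very strong" means here. On the real fit, the matrix to inspect is the one requested above; if I recall the emmeans API correctly, `cov2cor(vcov(emmeans(Model1, "Bond")))` should give the correlations among the three marginal means directly.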