r linear-regression emmeans lmertest

Emmeans is reporting different estimates and CIs for marginal means if printed as data.frame


After fitting an LMM I am using the emmeans() function to extract the estimated marginal means, their SEs and confidence intervals. However, depending on whether I print the means directly or save them as a data frame, the reported estimates, SEs and confidence intervals differ. Any insight would be appreciated.

Example (I was not able to use dput() to provide the raw data because of the character limit):

> summary(model)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: asin(sqrt(r_index)) ~ year + prov_season + factor_month + group + prov_season * year * group + (1 | individual)

Extraction of emmeans directly:

mm <- emmeans(model, pairwise ~ prov_season*year | group, at = list(year = c(1:8))) # extract estimates, SEs and CIs

> print(mm$emmeans)
group = naive:
 prov_season       year emmean     SE  df lower.CL upper.CL
 in                   1 0.0112 0.1587 309  -0.3011    0.324
 off                  1 0.0872 0.1768 378  -0.2604    0.435
 in                   2 0.0229 0.1437 253  -0.2600    0.306
 off                  2 0.1186 0.1577 313  -0.1916    0.429
 in                   3 0.0345 0.1305 203  -0.2228    0.292
 off                  3 0.1500 0.1405 247  -0.1268    0.427
 in                   4 0.0461 0.1199 162  -0.1906    0.283
 off                  4 0.1814 0.1261 189  -0.0674    0.430
 in                   5 0.0577 0.1125 136  -0.1647    0.280
 off                  5 0.2128 0.1155 148  -0.0155    0.441
 in                   6 0.0693 0.1090 125  -0.1465    0.285
 off                  6 0.2442 0.1098 128   0.0268    0.462
 in                   7 0.0810 0.1099 129  -0.1364    0.298
 off                  7 0.2756 0.1098 129   0.0584    0.493
 in                   8 0.0926 0.1149 149  -0.1345    0.320
 off                  8 0.3070 0.1154 151   0.0790    0.535

group = provisioned:
 prov_season       year emmean     SE  df lower.CL upper.CL
 in                   1 0.4076 0.0924 314   0.2258    0.589
 off                  1 0.2519 0.1043 413   0.0469    0.457
 in                   2 0.4422 0.0907 307   0.2638    0.621
 off                  2 0.2528 0.1000 381   0.0561    0.449
 in                   3 0.4768 0.0899 305   0.2999    0.654
 off                  3 0.2538 0.0970 355   0.0630    0.444
 in                   4 0.5114 0.0902 308   0.3339    0.689
 off                  4 0.2547 0.0952 337   0.0674    0.442
 in                   5 0.5461 0.0915 315   0.3659    0.726
 off                  5 0.2557 0.0949 329   0.0690    0.442
 in                   6 0.5807 0.0938 325   0.3961    0.765
 off                  6 0.2566 0.0959 331   0.0680    0.445
 in                   7 0.6153 0.0970 339   0.4245    0.806
 off                  7 0.2576 0.0983 342   0.0643    0.451
 in                   8 0.6499 0.1010 355   0.4512    0.849
 off                  8 0.2585 0.1019 361   0.0581    0.459

Results are averaged over the levels of: factor_month 
Degrees-of-freedom method: kenward-roger 
Results are given on the asin(sqrt(mu)) (not the response) scale. 
Confidence level used: 0.95 

Extraction of emmeans as.data.frame():

> as.data.frame(mm)
 group       prov_season       year contrast                             emmean      SE  df lower.CL upper.CL
 naive       in          1          .                                  0.011232 0.15872 309  -0.5897  0.61217
 naive       off         1          .                                  0.087219 0.17677 378  -0.5806  0.75500
 naive       in          2          .                                  0.022854 0.14365 253  -0.5225  0.56821
 naive       off         2          .                                  0.118613 0.15767 313  -0.4783  0.71550
 naive       in          3          .                                  0.034476 0.13049 203  -0.4628  0.53172
 naive       off         3          .                                  0.150007 0.14053 247  -0.3837  0.68374
 naive       in          4          .                                  0.046098 0.11986 162  -0.4128  0.50498
 naive       off         4          .                                  0.181401 0.12615 189  -0.2999  0.66275
 naive       in          5          .                                  0.057720 0.11249 136  -0.3749  0.49036
 naive       off         5          .                                  0.212795 0.11555 148  -0.2306  0.65616
 naive       in          6          .                                  0.069342 0.10904 125  -0.3511  0.48977
 naive       off         6          .                                  0.244189 0.10984 128  -0.1790  0.66738
 naive       in          7          .                                  0.080964 0.10988 129  -0.3423  0.50419
 naive       off         7          .                                  0.275583 0.10979 129  -0.1473  0.69850
 naive       in          8          .                                  0.092586 0.11491 149  -0.3483  0.53345
 naive       off         8          .                                  0.306977 0.11541 151  -0.1356  0.74957
 provisioned in          1          .                                  0.407628 0.09240 314   0.0578  0.75742
 provisioned off         1          .                                  0.251854 0.10425 413  -0.1416  0.64535
 provisioned in          2          .                                  0.442235 0.09067 307   0.0989  0.78555
 provisioned off         2          .                                  0.252805 0.10002 381  -0.1250  0.63063
 provisioned in          3          .                                  0.476842 0.08994 305   0.1363  0.81742
 provisioned off         3          .                                  0.253756 0.09698 355  -0.1128  0.62035
 provisioned in          4          .                                  0.511450 0.09023 308   0.1698  0.85310
 provisioned off         4          .                                  0.254708 0.09524 337  -0.1055  0.61493
 provisioned in          5          .                                  0.546057 0.09154 315   0.1995  0.89257
 provisioned off         5          .                                  0.255659 0.09488 329  -0.1033  0.61460
 provisioned in          6          .                                  0.580665 0.09382 325   0.2257  0.93566
 provisioned off         6          .                                  0.256610 0.09590 331  -0.1062  0.61941
 provisioned in          7          .                                  0.615272 0.09700 339   0.2484  0.98214
 provisioned off         7          .                                  0.257561 0.09827 342  -0.1141  0.62920
 provisioned in          8          .                                  0.649879 0.10101 355   0.2681  1.03170
 provisioned off         8          .                                  0.258513 0.10190 361  -0.1266  0.64363

Solution

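  • as.data.frame() is applying a Bonferroni correction to the confidence intervals by default, treating every row of the combined table (the means and the contrasts) as one family. The estimates and SEs themselves are unchanged; they are only printed with more digits.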
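
    The Bonferroni explanation can be checked by hand from the numbers in the question. The sketch below uses only base R and the first printed row (group = naive, prov_season = in, year = 1). The family size of 272 is an assumption inferred from the call: 32 means plus choose(16, 2) = 120 pairwise contrasts in each of the 2 groups. If that assumption holds, the two intervals should land close to the two printed versions.

    ```r
    # Values copied from the first row of the output in the question.
    est <- 0.011232
    se  <- 0.15872
    df  <- 309

    # Unadjusted 95% interval, which is what print(mm$emmeans) shows.
    crit_none <- qt(1 - 0.05 / 2, df)
    ci_none   <- est + c(-1, 1) * crit_none * se

    # Assumed family size: 32 means + 120 pairwise contrasts per group x 2 groups.
    n_rows    <- 32 + 2 * choose(16, 2)
    crit_bonf <- qt(1 - 0.05 / (2 * n_rows), df)
    ci_bonf   <- est + c(-1, 1) * crit_bonf * se

    # Compare with the printed (-0.3011, 0.324) and (-0.5897, 0.61217).
    print(round(ci_none, 4))
    print(round(ci_bonf, 4))
    ```

    The only thing that changes between the two intervals is the critical value of the t distribution, which is why the SE and df columns agree across both tables.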

    You can change this behaviour by passing, e.g., adjust="none" as an argument.

    There is more detail on this behaviour (with respect to p-value adjustment) in the answers to this question: "Why is converting emmeans contrasts to a data.frame not reporting correct p-values?"

    It can be difficult to predict which adjustment emmeans applies to p-values and confidence intervals, and it's not always obvious from the documentation, so it's usually better to control it explicitly.

    By the way, even though you couldn't paste your full data it would have been easy enough to make a small reproducible dataset to demonstrate your problem. emmeans with any linear model will behave in the same way.
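
    A minimal reproducible sketch of that kind, assuming the emmeans package is installed, could look like the following. The warpbreaks data set (shipped with R), the model and the object names are illustrative stand-ins and not the asker's data. The sketch converts the means and the contrasts separately and states the adjustment explicitly, so no correction is applied across the combined table.

    ```r
    library(emmeans)

    # Any linear model shows the same behaviour; warpbreaks ships with R.
    fit <- lm(breaks ~ wool * tension, data = warpbreaks)

    # Same call pattern as in the question: means plus pairwise contrasts.
    mm <- emmeans(fit, pairwise ~ tension | wool)

    # Means only: converting this component keeps the intervals that print() shows.
    ci_means <- as.data.frame(mm$emmeans)

    # The same intervals with the adjustment stated explicitly.
    ci_explicit <- as.data.frame(confint(mm$emmeans, adjust = "none"))

    # Contrasts handled separately, with an explicitly chosen adjustment.
    ci_contrasts <- as.data.frame(confint(mm$contrasts, adjust = "tukey"))

    print(ci_means)
    print(ci_contrasts)
    ```

    Keeping the means and the contrasts in separate data frames also avoids the placeholder "." entries that appear when both are stacked into one table.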