Search code examples
rmixed-modelsmarginal-effectsglmmtmb

Computing marginal effects: Why does ggeffect and ggemmeans give difference answers?


Example

library(glmmTMB)
library(ggeffects)

## Zero-inflated negative binomial model
(m <- glmmTMB(count ~ spp + mined + (1|site),
              ziformula=~spp + mined, 
              family=nbinom2, 
              data=Salamanders, 
              na.action = "na.fail"))
summary(m)

ggemmeans(m, terms="spp")
spp   | Predicted |       95% CI
--------------------------------
GP    |      1.11 | [0.66, 1.86]
PR    |      0.42 | [0.11, 1.59]
DM    |      1.32 | [0.81, 2.13]
EC-A  |      0.75 | [0.37, 1.53]
EC-L  |      1.81 | [1.09, 3.00]
DES-L |      2.00 | [1.25, 3.21]
DF    |      0.99 | [0.61, 1.62]

ggeffects::ggeffect(m, terms="spp")
spp   | Predicted |       95% CI
--------------------------------
GP    |      1.14 | [0.69, 1.90]
PR    |      0.44 | [0.12, 1.63]
DM    |      1.36 | [0.85, 2.18]
EC-A  |      0.78 | [0.39, 1.57]
EC-L  |      1.87 | [1.13, 3.07]
DES-L |      2.06 | [1.30, 3.28]
DF    |      1.02 | [0.63, 1.65]

Questions

Why are ggeffect and ggemmeans giving different results for the marginal effects? Is it simply something internal with how the packages emmeans and effects are computing them? Also, does anyone know of some resources on how to compute marginal effects from scratch for a model like that in the example?


Solution

  • You fit a complex model: zero-inflated negative binomial model with random effects.

    What you observe has little to do with the model specification. Let's show this by fitting a simpler model: Poisson with fixed effects only.

    library("glmmTMB")
    library("ggeffects")
    
    m <- glmmTMB(
      count ~ spp + mined,
      family = poisson,
      data = Salamanders
    )
    
    ggemmeans(m, terms = "spp")
    #> # Predicted counts of count
    #> 
    #> spp   | Predicted |       95% CI
    #> --------------------------------
    #> GP    |      0.73 | [0.59, 0.89]
    #> PR    |      0.18 | [0.12, 0.27]
    #> DM    |      0.91 | [0.76, 1.10]
    #> EC-A  |      0.34 | [0.25, 0.45]
    #> EC-L  |      1.35 | [1.15, 1.59]
    #> DES-L |      1.43 | [1.22, 1.68]
    #> DF    |      0.79 | [0.64, 0.96]
    
    ggeffect(m, terms = "spp")
    #> # Predicted counts of count
    #> 
    #> spp   | Predicted |       95% CI
    #> --------------------------------
    #> GP    |      0.76 | [0.62, 0.93]
    #> PR    |      0.19 | [0.13, 0.28]
    #> DM    |      0.96 | [0.79, 1.15]
    #> EC-A  |      0.35 | [0.26, 0.47]
    #> EC-L  |      1.41 | [1.20, 1.66]
    #> DES-L |      1.50 | [1.28, 1.75]
    #> DF    |      0.82 | [0.67, 1.00]
    

    The documentation explains that internally ggemmeans() calls emmeans::emmeans() while ggeffect() calls effects::Effect().

    Both emmeans and effects compute marginal effects but they make a different (default) choice how to marginalize out (ie. average over) mined in order to get the effect of spp.

    mined is a categorical variable with two levels: "yes" and "no". The crucial bit is that the two levels are not balanced: there are slightly more "no"s than "yes"s.

    xtabs(~ mined + spp, data = Salamanders)
    #>      spp
    #> mined GP PR DM EC-A EC-L DES-L DF
    #>   yes 44 44 44   44   44    44 44
    #>   no  48 48 48   48   48    48 48
    

    Intuitively, this means that the weighted average over mined [think of (44 × yes + 48 × no) / 92] is not the same as the simple average over mined [think of (yes + no) / 2].

    Let's check the intuition by specifying how to marginalize out mined when we call emmeans::emmeans() directly.

    # mean (default)
    emmeans::emmeans(m, "spp", type = "response", weights = "equal")
    #>  spp    rate     SE  df lower.CL upper.CL
    #>  GP    0.726 0.0767 636    0.590    0.893
    #>  PR    0.181 0.0358 636    0.123    0.267
    #>  DM    0.914 0.0879 636    0.757    1.104
    #>  EC-A  0.336 0.0497 636    0.251    0.449
    #>  EC-L  1.351 0.1120 636    1.148    1.590
    #>  DES-L 1.432 0.1163 636    1.221    1.679
    #>  DF    0.786 0.0804 636    0.643    0.961
    #> 
    #> Results are averaged over the levels of: mined 
    #> Confidence level used: 0.95 
    #> Intervals are back-transformed from the log scale
    
    # weighted mean
    emmeans::emmeans(m, "spp", type = "response", weights = "proportional")
    #>  spp    rate     SE  df lower.CL upper.CL
    #>  GP    0.759 0.0794 636    0.618    0.932
    #>  PR    0.190 0.0373 636    0.129    0.279
    #>  DM    0.955 0.0909 636    0.793    1.152
    #>  EC-A  0.351 0.0517 636    0.263    0.469
    #>  EC-L  1.412 0.1153 636    1.203    1.658
    #>  DES-L 1.496 0.1196 636    1.279    1.751
    #>  DF    0.822 0.0832 636    0.674    1.003
    #> 
    #> Results are averaged over the levels of: mined 
    #> Confidence level used: 0.95 
    #> Intervals are back-transformed from the log scale
    

    The second option returns the marginal effects computed with ggeffects::ggeffect.

    Update

    @Daniel points out that ggeffects accepts the weights argument and will pass it to emmeans. This way you can keep using ggeffects and still control how predictions are averaged to compute marginal effects.

    Try it out for yourself with:

    ggemmeans(m, terms="spp", weights = "proportional")
    ggemmeans(m, terms="spp", weights = "equal")