Search code examples
rlme4convergencerandom-effects

Model convergence warning with negative binomial glmer


I've read several posts about having convergence issues with glmer and I have tried a couple recommended work arounds (changing optimizer, changing model iterations, etc.) but nothing seems to resolve my convergence issue. I was wondering if someone could help me figure out what I am doing wrong?

Data description: The data is count data of Monarch butterfly stages throughout several years at different sites within the year. The data has over 6000 observations, but the number of observations and sites vary across years. The data has several zeros (no Monarch observed) and the data is over dispersed.

Count: Response variable of how many individuals were counted

Year: Factor with 6 levels

Stage: Factor with 7 levels

Site: Random categorical variable. Listed as a random variable because each site is a replicate within the year

Data summary:

       X            siteID                             site        year             date            stage    
 Min.   : 977   Min.   :2538   Groesbeck Field           : 581   2015: 817   7/22/2016:  91   adult    :976  
 1st Qu.:2696   1st Qu.:2541   Rain Garden               : 490   2016:1258   7/1/2016 :  77   no_eggs  :927  
 Median :4428   Median :2546   Preschool Butterfly Garden: 435   2017: 937   7/29/2016:  77   no_fifth :927  
 Mean   :4419   Mean   :2971   Spring Pond               : 427   2018:1710   8/12/2016:  77   no_first :927  
 3rd Qu.:6144   3rd Qu.:2892   Whitetail field           : 406   2019:1116   8/25/2016:  77   no_fourth:927  
 Max.   :7808   Max.   :6411   Pollinator Mound          : 387   2020: 700   (Other)  :6090   no_second:927  
                               (Other)                   :3812               NA's     :  49   no_third :927  
     count        
 Min.   : 0.0000  
 1st Qu.: 0.0000  
 Median : 0.0000  
 Mean   : 0.6422  
 3rd Qu.: 0.0000  
 Max.   :66.0000



head(monarch_long, n=20)
      X siteID                site year      date     stage count
1  1505   2541     Groesbeck Field 2018 8/21/2018   no_eggs    66
2  1748   2541     Groesbeck Field 2019 8/12/2019   no_eggs    53
3  1365   2543         Rain Garden 2017  8/7/2017   no_eggs    51
4  1591   2543         Rain Garden 2018 8/21/2018   no_eggs    49
5  1504   2541     Groesbeck Field 2018 8/15/2018   no_eggs    47
6  1469   2546    Butterfly Garden 2018  8/9/2018   no_eggs    46
7   981   2561    Barred Owl Trail 2015  8/5/2015   no_eggs    45
8  2447   2546    Butterfly Garden 2018 8/24/2018  no_first    44
9  3424   2546    Butterfly Garden 2018 8/31/2018 no_second    40
10 1503   2541     Groesbeck Field 2018  8/7/2018   no_eggs    38
11 3423   2546    Butterfly Garden 2018 8/24/2018 no_second    38
12 1021   2541     Groesbeck Field 2015 8/20/2015   no_eggs    37
13 4400   2546    Butterfly Garden 2018 8/31/2018  no_third    33
14 1265   2538 Wetland Restoration 2016 8/25/2016   no_eggs    32
15 1749   2541     Groesbeck Field 2019 8/23/2019   no_eggs    32
16 1470   2546    Butterfly Garden 2018 8/17/2018   no_eggs    31
17 1471   2546    Butterfly Garden 2018 8/24/2018   no_eggs    30
18 1034   2559 Parking Lot Islands 2015 8/11/2015   no_eggs    28
19 1588   2543         Rain Garden 2018  8/2/2018   no_eggs    27
20 6275   2892         Vernal Pool 2017 9/20/2017  no_fifth    27```

Code:

glmer.nb(count~year+stage+(1|site),  
         control=glmerControl(optimizer="bobyqa",
         optCtrl=list(maxfun=2e5)),
                   data=monarch_long)

Warning message: Warning message: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 0.0123119 (tol = 0.002, component 1)

Please let me know if I need to provide more details. Any advice would be greatly appreciated!


Solution

  • tl;dr I think your fit is actually fine. For negative binomial GLMMs I have now taken to recommending glmmTMB rather than lme4::glmer.nb. In any case, it works for checking the model parameters with a completely different implementation/algorithm for the model and making sure the answers are the same, which is the gold standard for addressing convergence warnings ...

    library(lme4)
    system.time(m1 <- glmer.nb(count~year+stage+(1|site),  
                   control=glmerControl(optimizer="bobyqa",
                                        optCtrl=list(maxfun=2e5)),
                   data=monarch_long))
    

    Refit with glmmTMB (using family="nbinom2")

    library(glmmTMB)
    system.time(m2 <- glmmTMB(count~year+stage+(1|site),  
                  family="nbinom2",
                  data=monarch_long))
    ## if you have multiple cores you can speed things up further ...
    system.time(m3 <- glmmTMB(count~year+stage+(1|site),  
                              family="nbinom2",
                              control=glmmTMBControl(parallel=5),
                              data=monarch_long)) ## 10 seconds
    

    Comparing the log-likelihood (with logLik()) shows that glmmTMB actually does slightly (0.2 log-likelihood units) better, but in any case the coefficient estimates are very similar (graphical comparison of fixed effects only, but the variance estimates also differ by <1%):

    library(ggplot2)
    library(dotwhisker)
    library(broom.mixed) 
    dwplot(list(glmer.nb=m1,glmmTMB=m2),effect="fixed") + theme_bw() +
        geom_vline(xintercept=0,lty=2)
    

    enter image description here

    data setup

    monarch_long <- read.csv("MLMP_2020\ long\ data\ -\ Sheet1.csv")
    monarch_long$year <- factor(monarch_long$year)
    monarch_long$stage <- factor(monarch_long$stage,
                                 levels=c(paste0("no_",
                  c("eggs","first","second","third","fourth","fifth")),"adult"))