Search code examples
rdata-modelingglmpoissonglmm

How to fix 'Error in vars$family.glmm$checkData(y) : data must be nonnegative integers'?


I am trying to run a glmm for the first time with my data. I have population data across 13 study sites and I am using a test file to see the results for blesbok in South Africa. This is my code (totally made up)

library(glmm)
glmm1<-glmm(Number~Location, 
           random= list(~0+Nitrogen,~0+Dist_water), 
           varcomps.names=c("Nit","Dist"), 
           data = bles, m=100, 
           family.glmm = poisson.glmm)

Where,

Number <- c(25,16,16,13,12,9,15,5,4,5,1,259,224,259,588,604,483,576,599,664)
Location <- c("Borakolalo","Borakolalo","Borakolalo","Borakolalo","Bloemhof","Bloemhof","Bloemhof",   
               "Bloemhof","Boskop","Boskop","Boskop","Boskop","Kgaswane","Kgaswane","Kgaswane",   
               "Kgaswane","Mafikeng","Mafikeng","Mafikeng","Mafikeng")
Nitrogen<-c(1.0889,1.1406,0.9835,1.0737,1.0578,1.0806,0.9914,0.9630,1.1718,0.8955,1.0211,0.9489,
            0.9808,1.0053,0.9682,0.9794,1.0959,1.0028,0.9281,0.9887)
Dist_water<- c(2156.0,3783.8,3285.8,2574.7,2242.3,1729.5,1018.1,1174.9,869.0,563.0,257.1,660.4,
               840.4,717.7,762.6,528.5,626.5,691.2,635.9,606.5)
bles<-data.frame(Number,Location,Nitrogen,Dist_water)

I keep getting this error "Error in vars$family.glmm$checkData(y) : data must be nonnegative integers."

I don't understand how to fix it.

I also get these errors if I try to change anything in my random effects or Location. "Error in uniroot(fred, c(beta.dn, beta.up)) : invalid 'xmax' value"

OR "Error in trust(fn.inner.trust, parinit = c(beta, s), rinit = 5, rmax = 10000, : parinit not feasible"

OR Error in chol.default(thatthing) : the leading minor of order 2 is not positive definite

can someone please explain me these errors? I don't understand how to fix them. I would really appreciate the help.

I added "0" in front of Location glmm1<-glmm(Number~0+Location, random= list(~0+Nitrogen,~0+Dist_water), varcomps.names=c("Nit","Dist"), data = bles, m=100, your text family.glmm = poisson.glmm)

changed the Nitrogen value to whole numbers and got the error

Error in uniroot(fred, c(beta.dn, beta.up)) : invalid 'xmax' value


Solution

  • I have a bunch to say here. tl;dr I can get glmm to work, but you might be better off with a more widely used package.

    • first, you need to explicitly set Number to be an integer. This is unusual, because in most contexts R doesn't care whether a number is stored as floating-point or integer:
    bles$Number <- as.integer(bles$Number)
    

    However, I think you have your fixed effects and random effects mixed up. It would be more usual to make nitrogen and water fixed and location a random effect:

    system.time(
        glmm1 <- glmm(Number~Nitrogen+Dist_water,
                      random= list(Number~0+Location),
                      varcomps.names=c("Location"),
                      data = bles, m=1e5, 
                      family.glmm = poisson.glmm)
    )
    

    I had trouble with the summary() metric: I kept getting estimates and standard errors listed as zero. coef(glmm1) and sqrt(diag(vcov(glmm1))) get the means and standard errors.

    I tried this again with lme4::glmer, looked at the model with performance::check_model(), realized there was a lot of overdispersion, and refitted with a negative binomial model.

    library(lme4)
    ## scale & center parameters to eliminate warnings
    ##  (not 100% necessary)
    bles_sc <- transform(bles,
                      Nitrogen = drop(scale(Nitrogen)),
                      Dist_water = drop(scale(Dist_water)))
    form <- Number~Nitrogen + Dist_water + (1|Location)
    glmm2 <- glmer(form,
                  data = bles_sc,
                  family = poisson)
    glmm3 <- glmer.nb(form, data = bles_sc)
    

    The negative binomial model says that there aren't significant effects of nitrogen or water ...

    summary(glmm3)
    Generalized linear mixed model fit by maximum likelihood (Laplace
      Approximation) [glmerMod]
     Family: Negative Binomial(1.0774)  ( log )
    Formula: Number ~ Nitrogen + Dist_water + (1 | Location)
       Data: bles_sc
    
         AIC      BIC   logLik deviance df.resid 
       238.6    243.6   -114.3    228.6       15 
    
    Scaled residuals: 
        Min      1Q  Median      3Q     Max 
    -1.0120 -0.4518 -0.1304  0.3530  2.3174 
    
    Random effects:
     Groups   Name        Variance Std.Dev.
     Location (Intercept) 1.398    1.182   
    Number of obs: 20, groups:  Location, 5
    
    Fixed effects:
                Estimate Std. Error z value Pr(>|z|)    
    (Intercept)   4.3591     0.5762   7.566 3.86e-14 ***
    Nitrogen     -0.4359     0.3391  -1.286    0.199    
    Dist_water   -0.4209     0.5364  -0.785    0.433    
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    

    This agrees with the plotted data

    bles_long <- tidyr::pivot_longer(bles, -c(Number, Location), names_to = "var")
    ggplot(bles_long) +
        aes(value, Number, colour = Location) +
        geom_point() +
        geom_smooth(method = "lm") +
        scale_y_log10() +
        facet_wrap(~var, scale = "free_x")
    

    enter image description here