Error: "initial value in 'vmmin' is not finite" not in mle2() but in confint()


I know the web is plastered with questions (and answers) about the "initial value in 'vmmin' is not finite" error when trying to fit parameters for an mle2 object. I do not get this error when creating my mle2 object, but I DO get it when trying to find the 95% CI for a parameter from that object.

Here is a reproducible example:

Here are the data:

d = structure(list(SST_1YR = c(11.6, 11.7, 11.9, 12, 12.1, 12.2, 
12.3, 12.4, 12.5, 12.6, 12.7, 12.8, 12.9, 13, 13.1, 13.2, 13.3, 
13.4, 13.5, 13.6, 13.7, 13.8, 13.9, 14, 14.2, 14.3, 14.4, 14.5, 
14.6, 14.7, 14.8, 14.9, 15, 15.1, 15.2, 15.3, 15.4, 15.5, 15.6, 
15.7, 15.8, 15.9, 16, 16.2, 16.3, 16.5, 16.6, 16.7, 16.9, 17, 
17.1, 17.2, 17.3, 17.4, 17.5, 17.6, 17.7, 17.8, 17.9), DML = structure(c(84.5, 
71, 114.75, 90.9473684210526, 31.7631578947368, 92.5, 80.4, 98.7021276595745, 
70.8, 66.8382352941177, 70.2553191489362, 98.1111111111111, 86.5241379310345, 
59.7209302325581, 38.7692307692308, 78.2028985507246, 86.3503649635037, 
69.1161290322581, 61.9122807017544, 60.1212121212121, 98.5490196078431, 
94.3145161290323, 76.5643564356436, 39.4230769230769, 98.42, 
95.6129032258064, 65.9673202614379, 39, 64.0576923076923, 42.4166666666667, 
59.6989247311828, 62.8039215686275, 74.5263157894737, 50.8888888888889, 
64.35, 40.5, 53.7466666666667, 42, 49.5, 23.8888888888889, 39.6170212765957, 
74.8947368421053, 42.8518518518519, 40.0344827586207, 53, 39.3333333333333, 
24.1333333333333, 30, 39.4880952380952, 94.4883720930233, 69.1428571428571, 
33.7179487179487, 26.1538461538462, 37.8965517241379, 38.4117647058824, 
44.2727272727273, 68.3157894736842, 37.3, 43.4444444444444), .Dim = 59L, .Dimnames = list(
    c("11.6", "11.7", "11.9", "12", "12.1", "12.2", "12.3", "12.4", 
    "12.5", "12.6", "12.7", "12.8", "12.9", "13", "13.1", "13.2", 
    "13.3", "13.4", "13.5", "13.6", "13.7", "13.8", "13.9", "14", 
    "14.2", "14.3", "14.4", "14.5", "14.6", "14.7", "14.8", "14.9", 
    "15", "15.1", "15.2", "15.3", "15.4", "15.5", "15.6", "15.7", 
    "15.8", "15.9", "16", "16.2", "16.3", "16.5", "16.6", "16.7", 
    "16.9", "17", "17.1", "17.2", "17.3", "17.4", "17.5", "17.6", 
    "17.7", "17.8", "17.9")))), .Names = c("SST_1YR", 
"DML"), row.names = c(NA, -59L), class = "data.frame")

Here is the creation of the mle2 object (with no warnings...)

library(bbmle)
m = mle2(DML~dgamma(scale=(a+b*SST_1YR)/sh, shape=sh), start=list(a=170, b=-7.4, sh=10), data=d)

And here is where I get an NA and my vmmin warning for the lower bound of parameter b:

confint(m)

I have tried changing the starting values but nothing I have tried has helped. I have created other models with the same data but a different distribution and no error. Can anyone help me figure out what is causing this error?

Using package bbmle-1.0.17


Solution

  • There are a few things to try here. First look at the data (always a good idea):

    library("ggplot2"); theme_set(theme_bw())
    ## recent ggplot2 versions take the GLM family via method.args
    ggplot(d,aes(SST_1YR,DML)) + geom_point()+
        geom_smooth(method="glm",method.args=list(family=Gamma(link="identity")))+
            geom_smooth(method="lm",colour="red",fill="red")
    

    Note that in this case the Gamma regression looks almost identical to a regular linear regression (i.e. the shape parameter is large). Also, the distribution of the x values is far from the origin -- this may lead to numeric problems.

    library("bbmle")
    m <- mle2(DML~dgamma(scale=(a+b*SST_1YR)/sh, shape=sh),
              start=list(a=170, b=-7.4, sh=10), data=d)
    confint(m)
    

    Confirms the problem:

    ##        2.5 %     97.5 %
    ## a  132.05952 203.192159
    ## b         NA  -4.407289
    ## sh   6.83566  13.933383
    

    I thought that setting parscale could help, but it appears to make the problem worse rather than better:

    m2 <- update(m,control=list(parscale=c(a=170,b=8,sh=10)))
    confint(m2)
    ##       2.5 %     97.5 %
    ## a        NA 203.153230
    ## b        NA  -4.407281
    ## sh 6.835659  13.933383
    

    Does centering the predictor variable help? scale(x,scale=FALSE) centers x but doesn't rescale it. (Using SST_1YR-mean(SST_1YR) might be clearer; that way we wouldn't have three "scale"s floating around in the expression.)

    m3 <- mle2(DML~dgamma(scale=(a+b*scale(SST_1YR,scale=FALSE))/sh, shape=sh),
              start=list(a=170, b=-7.4, sh=10), data=d)
    
    confint(m3)
    ##       2.5 %    97.5 %
    ## a  56.462610 66.754118
    ## b  -9.421521 -4.407262
    ## sh  6.835662 13.933384
    

    Looks good, although it would be a little tricky to get the interval for the intercept back to the original scale (we could just take it from the previous, uncentered fit, where the interval for a was fine).
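
For the point estimate alone, the back-transformation is plain algebra: a_c + b*(x - mean(x)) = (a_c - b*mean(x)) + b*x. The sketch below checks this on simulated data; the data, the use of glm() in place of mle2(), and all variable names are assumptions made for illustration, not part of the original analysis.

```r
# Simulated stand-in for the question's data (NOT the real measurements)
set.seed(1)
x  <- seq(11.6, 17.9, length.out = 59)
mu <- 170 - 7.4 * x                        # linear mean, positive over this range
y  <- rgamma(length(x), shape = 10, scale = mu / 10)

x_bar <- mean(x)
xc    <- x - x_bar                         # same values as scale(x, scale = FALSE)

# Gamma regression with identity link on the raw and the centered predictor
fit_raw <- glm(y ~ x,  family = Gamma(link = "identity"))
fit_ctr <- glm(y ~ xc, family = Gamma(link = "identity"))

# Recover the uncentered intercept from the centered fit
a_ctr  <- unname(coef(fit_ctr)[1])
b      <- unname(coef(fit_ctr)[2])
a_orig <- a_ctr - b * x_bar

print(c(back_transformed = a_orig,
        uncentered_fit   = unname(coef(fit_raw)[1])))
```

This recovers only the point estimate. The interval for the intercept does not transform this simply, because the estimates of a and b are correlated.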

    It turns out you can also fit this model via

    glm(DML~SST_1YR,family=Gamma(link="identity"),data=d)
    

    although confint() again fails rather mysteriously (Error in y/mu: non-conformable arrays).
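
If profile intervals are unavailable for the glm() fit, one possible fallback is a Wald (quadratic-approximation) interval from confint.default() in the stats package. The sketch below uses simulated data as a stand-in for d (an assumption; these are not the real measurements). Wald intervals are generally less reliable than profile intervals when the likelihood surface is asymmetric.

```r
# Simulated stand-in for the question's data (NOT the real measurements)
set.seed(1)
x <- seq(11.6, 17.9, length.out = 59)
y <- rgamma(length(x), shape = 10, scale = (170 - 7.4 * x) / 10)

# Same model as the mle2() fit: Gamma errors, identity link
fit <- glm(y ~ x, family = Gamma(link = "identity"))

# Pearson-based estimate of the shape parameter (1 / dispersion), not the MLE
shape_hat <- 1 / summary(fit)$dispersion

# Wald intervals: estimate +/- normal quantile * standard error
wald_ci <- confint.default(fit)
print(wald_ci)
print(shape_hat)
```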

    Some other things that I tried that didn't work particularly well (included here only for completeness):

    1. try to prevent the linear predictor from going negative:
    ## pmax() floors the scale at 1e-5
    mle2(DML~dgamma(scale=pmax((a+b*SST_1YR)/sh,1e-5),
                          shape=sh),
              start=list(a=170, b=-7.4, sh=10), data=d)
    
    2. use a penalized form of dgamma to return bad likelihoods rather than NA when x<0:
    dgamma_pen <- function(x,...,log=FALSE) {
       ## ifelse() keeps the check vectorized over all observations
       r <- ifelse(x<0, -100, dgamma(x,...,log=TRUE))
       if (log) r else exp(r)
    }
    
    m4 <- mle2(DML~dgamma_pen(scale=pmax((a+b*SST_1YR)/sh,1e-5),
                        shape=sh),
             start=list(a=170, b=-7.4, sh=10), data=d)