Search code examples
rmgcvconvergenceautocorrelation

How to correct false convergence mgcv::gamm after updating R


I am getting a false convergence on a GAMM using mgcv::gamm when I try to fit models with corARMA functions for correlated error structures. The response data are amounts of precipitation, and are Gamma-distributed. The predictors are year and day-of-year ('julian'). I am fitting the model to the subset of observations for which precipitation occurred (i.e., I'm modelling only non-0 precipitation days). You will see there are a lot of NAs in the data, this is because I'm only interested in precipitation patterns of the summer season. I was advised by a colleague to pad the other dates with NAs instead of eliminating them (I was told the model might try to connect the end of august to the beginning of may in a cyclical pattern if I omitted the Sept-Apr dates).

The correlated error structures are meant to describe temporal autocorrelation, following the advice/process outlined in Gavin Simpson's excellent blog post . This code fit the model easily when I was working with R version 3.6.1, but after updating to 4.1.2 I am getting false convergence errors and "coefficient matrix not invertible" errors. I've tried changing some of the controls (mainly number of iterations) and that doesn't seem to help. I'm unsure of how to adjust controls for model convergence in this case (and of why an R update would cause this problem), and I don't know anything about the "coefficient matrix not invertible" error. Any advice or thoughts are much appreciated.

Link to download the data (too many rows to directly post)

Modelling without the autocorrelated error structure works fine:

require("mgcv")
cv.data <- read.csv("cv.data.csv", header = T)
#model precipitation amount for days on which it did rain:
gam.precip.non0<-gamm(PRCP ~ s(julian) + s(year, k = 23), data =  cv.data[cv.data$prcp_binary!=0,], family = Gamma, method = "REML")
   summary(gam.precip.non0$gam)

Fitting a couple of autocorrelation functions to compare:

    ctrl<-list(niterEM = 0, optimMethod = "L-BFGS-B", maxIter = 100, msMaxIter = 100)
gam.non01<-gamm(PRCP ~ s(julian) + s(year, k = 23), 
                data = cv.data[cv.data$prcp_binary!=0,], 
                family = Gamma, method = "REML", control = ctrl, 
                correlation = corARMA(form = ~1|year, p = 1))
#false convergence

gam.non02<-gamm(PRCP ~ s(julian) + s(year, k = 23), 
                data = cv.data[cv.data$prcp_binary!=0,], 
                family = Gamma, method = "REML", control = ctrl, 
                correlation = corARMA(form = ~1|year, p = 2))
#other error: "coefficient matrix not invertible"

gam.non03<-gamm(PRCP ~ s(julian) + s(year, k = 23), 
                data = cv.data[cv.data$prcp_binary!=0,], 
                family = Gamma, method = "REML", control = ctrl, 
                correlation = corARMA(form = ~1|year, p = 3))
#other error: "coefficient matrix not invertible"

Solution

  • I am not sure if this is what you want, but I suspect that the problem occurs because of the structure of year that has repeated values and gaps.

    It works (technically) if we use a continuous time variable:

    require("mgcv")
    cv.data <- read.csv("cv.data.csv", header = T)
    #model precipitation amount for days on which it did rain:
    gam.precip.non0<-gamm(PRCP ~ s(julian) + s(year, k = 23), data =  cv.data[cv.data$prcp_binary!=0,], family = Gamma, method = "REML")
    summary(gam.precip.non0$gam)
    
    plot(PRCP ~ X, data=cv.data)
    
    ## check if X is time, approximately
    #cv.data$time <- cv.data$year * 365 + cv.data$julian
    #plot(PRCP ~  time, data=cv.data)
    
    ## add msVerbose to show trace
    ctrl <- list(niterEM = 0, optimMethod = "L-BFGS-B", maxIter = 100, msMaxIter = 100, msVerbose=TRUE)
    
    ## use time or X for autocorelation
    gam.non01 <- gamm(PRCP ~ s(julian) + s(year, k = 23), 
                    data = cv.data[cv.data$prcp_binary!=0,], 
                    family = Gamma, method = "REML", control = ctrl, 
                    correlation = corARMA(form = ~1|X, p = 1))
    
    summary(gam.non01$gam)
    vis.gam(gam.non01$gam)
    

    I suspect that this is only halfway, but a careful rethinking of the intended correlation structure may help.