Search code examples
rregressiongammgcv

Random effects in GAM and one other smooth make covariance matrix non-positive definite


I am using gam in mgcv to fit the model

m <- gam(y ~ s(x) + s(Group, bs = "re"))

however this makes the covariance-matrix

vcov(m)

non-positive definite. Further, the estimated s(x) is just a straight line.

Now, removing the s(x) part fixes the vcov problem, and removing the s(Group) also fixes these problems and then the estimated curve s(x) is not a straight line.

Does anybody know why this happens, and how to fix it? Such that I can include both s(x) and s(Group), and also get a positive-definite vcov matrix, and properly estimated curves (even if they were to be insignificant)?

I believe it is probably caused by the fact that x and Group are confounded to a degree, such that only one of them is needed?


Solution

  • This means that the penalized least squares problem is not full-rank, even after regularization by the ordinary quadratic penalty. Check your coefficients first. If any coefficients are exactly 0, then they are not identifiable and constrained at 0.

    any(coef(m) == 0)
    

    Note that unlike in lm and glm, where unidentifiable coefficients are coded NA, mgcv just uses 0. Of course, the resulting standard error of these coefficients are also 0, this is why the covariance matrix given by vcov is not positive definite.

    The fix is to use a stronger regularization. Try shrinkage smooth here, for example,

    s(x, bs = 'cs', k = 10)
    s(x, bs = 'ts', k = 10)
    

    See ?smooth.terms for all smoothing basis classes, including the shrinkage class.


    Another thing to try is to make your model less flexible, by reducing the rank of s(x), for example

    s(x, k = 5)
    s(x, k = 3)
    

    The minimal k is 3 for cubic spline.

    But this is not guaranteed to fix your problem, as if it is the NULL space of the s(x) that is confounded with s(Group, bs = 're'), you have to use shrinkage smooth to penalize the NULL space.