
Contrast plot for GAM using mgcv


When using the visreg package to visualise a GAM with a contrast plot, the width of the confidence interval shrinks to zero at the inflection point when the fitted curve is U-shaped:

# Load libraries
library(mgcv)
library(visreg)

# Synthetic data (seed set so the jittered values are reproducible)
set.seed(42)
df <- data.frame(a = -10:10, b = jitter((-10:10)^2, amount = 10))

# Fit GAM
res <- gam(b ~ s(a), data = df)

# Make contrast figure
visreg(res, type = "contrast")

[Figure: visreg contrast plot in which the confidence band pinches to zero width at the minimum of the U-shaped smooth]

This seems dodgy and doesn't happen when making a conditional plot (i.e., visreg(res, type = "conditional")), so instead I'm looking at the mgcv package to make the same plot. I can make a conditional plot using mgcv (e.g., plot.gam(res)), but I don't see the option to make a contrast plot. Is this possible with the mgcv package?


Solution

  • This is due to the identifiability constraints imposed on the spline basis (or bases) used in the model. The constraint is a sum-to-zero constraint that effectively removes an intercept-like basis function from the basis of each smooth term, so that the smooths are not confounded with the model intercept. Without it the model would not be identifiable; there would be an infinity of solutions.

    Under the standard theory, the width of the confidence interval has to tend to zero where the estimated smooth crosses zero on the y-axis (usually the centred effect, though here, as shown, it is on some transformed scale), because the constraint implies that at some point x the effect is exactly 0 with 0 variance.

    This is nonsense of course, and recent research has investigated the problem. One solution, provided by Simon Wood and colleagues, extends Nychka's observation that, for the Gaussian case, the Bayesian credible interval for a smooth has good across-the-function interpretation as a frequentist confidence interval (so not pointwise, but not simultaneous either). Nychka's results (the coverage properties of the interval) fail where the estimated smooth has squared bias that is not substantially smaller than the variance of the estimate; that condition is clearly violated when the variance hits zero as the estimated smooth passes through zero effect, because the bias is not actually zero at that point.

    Marra and Wood (2012) extended these results to the generalized model setting, essentially estimating the confidence interval for one smooth by assuming that all the other terms in the model have had the identifiability constraints applied to them, but not the smooth of interest. This shifts the focus of inference from the smooth alone to the smooth + intercept. You can turn this on in plot.gam() with the argument seWithMean = TRUE (see the first sketch at the end of this answer).

    I don't see an easy way to make visreg do this, however, although it is trivial to get the information you want via predict.gam() with the options type = 'iterms', se.fit = TRUE. This returns, on the scale of the linear predictor, the contribution of each smooth term plus standard errors that include the correction implied by seWithMean. You can then fiddle with this to your heart's content; adding on the model constant term (the estimate of the intercept), for example, should give you something close to the figure in your question (see the second sketch below).
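
    First, a minimal sketch of the plot.gam() route, assuming the res model fitted above. seWithMean = TRUE includes the uncertainty about the intercept in the interval, so the band no longer pinches to zero width; shade and shift are just standard plot.gam() arguments used here to tidy the display.

    # Replot the smooth with the correction discussed above; the interval
    # now includes the uncertainty in the model intercept and no longer
    # collapses to zero width where the centred smooth crosses zero
    plot(res, seWithMean = TRUE, shade = TRUE)

    # Optionally shift the centred smooth by the estimated intercept so the
    # y-axis is on the scale of the response (a Gaussian model here)
    plot(res, seWithMean = TRUE, shade = TRUE, shift = coef(res)[1])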
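
    And a sketch of the predict.gam() route, again assuming res and df from the question; the grid of 200 values and the use of ±2 standard errors for an approximate 95% interval are my choices for this illustration.

    # New data over the range of the covariate
    newd <- data.frame(a = seq(-10, 10, length.out = 200))

    # Per-term contributions with standard errors that include the
    # intercept uncertainty (the seWithMean-style correction)
    pr <- predict(res, newdata = newd, type = "iterms", se.fit = TRUE)

    # Contribution of s(a) on the linear predictor scale and a ~95% band
    fit <- pr$fit[, "s(a)"]
    se  <- pr$se.fit[, "s(a)"]

    # Add the model constant term to get something close to the
    # conditional-style figure in the question
    b0 <- coef(res)[1]

    plot(newd$a, fit + b0, type = "l", xlab = "a", ylab = "Estimated effect",
         ylim = range(fit + b0 - 2 * se, fit + b0 + 2 * se))
    lines(newd$a, fit + b0 + 2 * se, lty = 2)
    lines(newd$a, fit + b0 - 2 * se, lty = 2)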