Search code examples
rgammgcv

plot.gam and vis.gam produce different surfaces for same mgcv GAM


I'm fitting a GAM with multiple continuous/continuous tensor interactions and try to visualize the partial effects in a 3D surface plot. Depending on what terms I include in my model, plot.gam and vis.gam produce different surfaces and I don't understand why.

df=structure(list(y = c(110.045, 107.175, 108.925, 114.18, 112.94, 
114.69, 113.87, 110.745, 107.52, 107.875, 113.255, 110.325, 108.655, 
112.79, 111.155, 109.58, 110.94, 111.01, 106.755, 110.035, 106.915, 
102.13, 103.08, 102.235, 103.34, 102.95, 100.42, 99.13, 98.14, 
94.855, 100.62, 105.24, 106.535, 98.69, 105.835, 113.12, 109.455, 
107.925, 108.42, 115.455, 116.575, 110.935, 115.115, 110.905, 
108.225, 112.03, 108.45, 109.805, 97.045, 94.36, 96.535), x = c(10, 
10, 14.5, 15, 15, 19, 23, 23, 13, 14, 14, 15, 14.5, 14.5, 17.5, 
11, 11, 8, 19.5, 19.5, 10.5, 5, 5, 2, 2, 2, 3.5, 3.5, 3.5, 4.5, 
4.5, 11, 9, 9, 18, 18, 18, 24, 19, 19, 22, 16, 16, 17.5, 23, 
23, 3, 5.5, 2, 4, 2), cov1 = c(57, 57, 61, 69, 69, 76, 78, 78, 
84, 92, 92, 95, 96, 96, 97, 98, 98, 99, 105, 105, 109, 118, 118, 
119, 120, 120, 124, 131, 131, 20, 20, 38, 41, 41, 46, 52, 52, 
64, 69, 69, 74, 89, 89, 94, 101, 101, 116, 123, 129, 136, 144
), cov2 = c(0.0476190476190476, 0.0476190476190476, 0.164556962025316, 
0.0547945205479452, 0.0547945205479452, 0.0958904109589041, 0.0352941176470588, 
0.0352941176470588, 0.0161290322580645, 0.0576923076923077, 0.0576923076923077, 
0.0277777777777778, 0.1125, 0.1125, 0.0594059405940594, 0.08, 
0.08, 0.176470588235294, 0.0555555555555556, 0.0555555555555556, 
0.0169491525423729, 0.0357142857142857, 0.0357142857142857, 0, 
0.166666666666667, 0.166666666666667, 0.4375, 0, 0, 0.0869565217391304, 
0.0869565217391304, 0.2, 0.024390243902439, 0.024390243902439, 
0.0714285714285714, 0.105769230769231, 0.105769230769231, 0.125874125874126, 
0.0375, 0.0375, 0.10752688172043, 0.0394736842105263, 0.0394736842105263, 
0.0508474576271186, 0.184397163120567, 0.184397163120567, 0.0434782608695652, 
0.0625, 0.125, 0, 0.555555555555556), blocking_var = c("C", "A", 
"A", "C", "B", "B", "B", "C", "A", "A", "B", "C", "A", "B", "C", 
"C", "B", "B", "A", "C", "C", "A", "B", "A", "B", "A", "B", "B", 
"A", "A", "C", "A", "B", "A", "C", "B", "C", "A", "A", "B", "B", 
"A", "C", "A", "A", "C", "B", "B", "A", "A", "A")), row.names = c(1L, 
3L, 5L, 7L, 9L, 11L, 13L, 15L, 17L, 19L, 21L, 23L, 25L, 27L, 
29L, 31L, 33L, 35L, 37L, 39L, 41L, 43L, 45L, 47L, 49L, 51L, 53L, 
55L, 57L, 110L, 112L, 114L, 116L, 118L, 120L, 122L, 124L, 126L, 
128L, 130L, 132L, 134L, 136L, 138L, 140L, 142L, 144L, 146L, 148L, 
150L, 152L), class = "data.frame")

When I fit my model like this and plot it:

gam<-gam(y~ s(x) + ti(x,cov1) +  ti(x,cov2) + blocking_var, data=df, method = 'REML')
par(mfrow = c(1,2))
plot(gam, select=2, scheme=1, theta=35, phi=32, col='grey80') 
vis.gam(gam, view=c('x', 'cov1'), n.grid=50, theta=35, phi=32, zlab="", too.far=0.1)

I get two different surfaces:

GAM Plot

When I fit it with just te() terms, the plots are the same. However, this is not what I need, as I am not interested in the main effects of the covariates (which would also introduce multicollinearity).

gam.2<-gam(y~ te(x,cov1) +  te(x,cov2) + blocking_var, data=df, method = 'REML')

GAM2


Solution

  • plot.gam shows partial effect plots — these are plots of the separate effects of the terms in the model.

    vis.gam() is showing you the output from the model as a whole, at values of x and cov1 while holding cov2 at some fix representative value. In other words you are seeing the fitted response y for the covariate values specified.

    The difference arises because in the vis.gam() case, you are also seeing the effect of s(x), and as you are varying x the effect on the plotted surface is not a constant everywhere.

    Use plot.gam() if you want true partial effect plots. Use vis.gam() is you want to look at how the response varies as a function of two covariates, while holding the other covariates fixed at representative (or user supplied) values.

    If you want something else, explain what you want and I can see how to generate it.