I'm fitting a GAM with multiple continuous/continuous tensor interactions and try to visualize the partial effects in a 3D surface plot. Depending on what terms I include in my model, plot.gam
and vis.gam
produce different surfaces and I don't understand why.
df=structure(list(y = c(110.045, 107.175, 108.925, 114.18, 112.94,
114.69, 113.87, 110.745, 107.52, 107.875, 113.255, 110.325, 108.655,
112.79, 111.155, 109.58, 110.94, 111.01, 106.755, 110.035, 106.915,
102.13, 103.08, 102.235, 103.34, 102.95, 100.42, 99.13, 98.14,
94.855, 100.62, 105.24, 106.535, 98.69, 105.835, 113.12, 109.455,
107.925, 108.42, 115.455, 116.575, 110.935, 115.115, 110.905,
108.225, 112.03, 108.45, 109.805, 97.045, 94.36, 96.535), x = c(10,
10, 14.5, 15, 15, 19, 23, 23, 13, 14, 14, 15, 14.5, 14.5, 17.5,
11, 11, 8, 19.5, 19.5, 10.5, 5, 5, 2, 2, 2, 3.5, 3.5, 3.5, 4.5,
4.5, 11, 9, 9, 18, 18, 18, 24, 19, 19, 22, 16, 16, 17.5, 23,
23, 3, 5.5, 2, 4, 2), cov1 = c(57, 57, 61, 69, 69, 76, 78, 78,
84, 92, 92, 95, 96, 96, 97, 98, 98, 99, 105, 105, 109, 118, 118,
119, 120, 120, 124, 131, 131, 20, 20, 38, 41, 41, 46, 52, 52,
64, 69, 69, 74, 89, 89, 94, 101, 101, 116, 123, 129, 136, 144
), cov2 = c(0.0476190476190476, 0.0476190476190476, 0.164556962025316,
0.0547945205479452, 0.0547945205479452, 0.0958904109589041, 0.0352941176470588,
0.0352941176470588, 0.0161290322580645, 0.0576923076923077, 0.0576923076923077,
0.0277777777777778, 0.1125, 0.1125, 0.0594059405940594, 0.08,
0.08, 0.176470588235294, 0.0555555555555556, 0.0555555555555556,
0.0169491525423729, 0.0357142857142857, 0.0357142857142857, 0,
0.166666666666667, 0.166666666666667, 0.4375, 0, 0, 0.0869565217391304,
0.0869565217391304, 0.2, 0.024390243902439, 0.024390243902439,
0.0714285714285714, 0.105769230769231, 0.105769230769231, 0.125874125874126,
0.0375, 0.0375, 0.10752688172043, 0.0394736842105263, 0.0394736842105263,
0.0508474576271186, 0.184397163120567, 0.184397163120567, 0.0434782608695652,
0.0625, 0.125, 0, 0.555555555555556), blocking_var = c("C", "A",
"A", "C", "B", "B", "B", "C", "A", "A", "B", "C", "A", "B", "C",
"C", "B", "B", "A", "C", "C", "A", "B", "A", "B", "A", "B", "B",
"A", "A", "C", "A", "B", "A", "C", "B", "C", "A", "A", "B", "B",
"A", "C", "A", "A", "C", "B", "B", "A", "A", "A")), row.names = c(1L,
3L, 5L, 7L, 9L, 11L, 13L, 15L, 17L, 19L, 21L, 23L, 25L, 27L,
29L, 31L, 33L, 35L, 37L, 39L, 41L, 43L, 45L, 47L, 49L, 51L, 53L,
55L, 57L, 110L, 112L, 114L, 116L, 118L, 120L, 122L, 124L, 126L,
128L, 130L, 132L, 134L, 136L, 138L, 140L, 142L, 144L, 146L, 148L,
150L, 152L), class = "data.frame")
When I fit my model like this and plot it:
gam<-gam(y~ s(x) + ti(x,cov1) + ti(x,cov2) + blocking_var, data=df, method = 'REML')
par(mfrow = c(1,2))
plot(gam, select=2, scheme=1, theta=35, phi=32, col='grey80')
vis.gam(gam, view=c('x', 'cov1'), n.grid=50, theta=35, phi=32, zlab="", too.far=0.1)
I get two different surfaces:
When I fit it with just te()
terms, the plots are the same. However, this is not what I need, as I am not interested in the main effects of the covariates (which would also introduce multicollinearity).
gam.2<-gam(y~ te(x,cov1) + te(x,cov2) + blocking_var, data=df, method = 'REML')
plot.gam
shows partial effect plots — these are plots of the separate effects of the terms in the model.
vis.gam()
is showing you the output from the model as a whole, at values of x
and cov1
while holding cov2
at some fix representative value. In other words you are seeing the fitted response y
for the covariate values specified.
The difference arises because in the vis.gam()
case, you are also seeing the effect of s(x)
, and as you are varying x
the effect on the plotted surface is not a constant everywhere.
Use plot.gam()
if you want true partial effect plots. Use vis.gam()
is you want to look at how the response varies as a function of two covariates, while holding the other covariates fixed at representative (or user supplied) values.
If you want something else, explain what you want and I can see how to generate it.