Working in R. I'm having trouble calculating predicted values on the response scale when I have to exclude a random effect from the prediction. To exclude the random effect I need to specify type = "terms", which makes it impossible to also pass type = "response". Is there a way of back-transforming the predicted values to the response scale (beta regression)? Or is it possible to specify both the exclusion of Area and type = "response" in the predict function? Please see my code below.
str(data_re)
# 'data.frame': 35 obs. of 17 variables:
# $ ProportionBirdsScavenging: num 0.6619 0.4062 0.6943 0.0143 0.0143 ...
# $ OverheadCover : num 0.7 0.671 0.679 0.79 0.62 ...
# $ Area : Factor w/ 6 levels "Hamert","KempenBroek",..: 3 1 1 1 1 1 1 1 1 2 ...
# $ pointWeight : int 3 233 10 89 4 22 44 99 89 17 ...
mygam <- mgcv::gam(ProportionBirdsScavenging ~ OverheadCover + s(Area, bs="re"), family=betar(link="logit"), data = data_re, weights = pointWeight)
new.xgam <- expand.grid(OverheadCover = seq(0, 1, length.out = 1000))
new.xgam$Area <- "a" # pad new.xgam with an arbitrary value for variable Area -> https://stackoverflow.com/questions/54411851/mgcv-how-to-use-exclude-argument-in-predict-gam
new.ygam <- predict.gam(mygam, newdata = new.xgam, type = "terms", exclude = "s(Area)") # Because I have to specify type = "terms", I can't specify type = "response".
new.ygam <- data.frame(new.ygam)
head(new.ygam) # not on the response scale (0,1)
# OverheadCover
# 1 0.000000000
# 2 -0.004390776
# 3 -0.008781551
# 4 -0.013172327
# 5 -0.017563103
# 6 -0.021953878
You're misreading the documentation for the argument exclude:
exclude: if type=="terms" or type="iterms" then terms (smooth or parametric) named in this array will not be returned. Otherwise any smooth terms named in this array will be set to zero. If NULL then no terms are excluded. Note that this is the term names as it appears in the model summary, see example. You can avoid providing the covariates for the excluded terms by setting newdata.guaranteed=TRUE, which will avoid all checks on newdata.
(emphasis mine).
You can use type = "response" together with exclude = "s(Area)", and the random effect should be ignored. You do have to pass some values for Area in newdata, otherwise this won't work; just set every row of the Area column in newdata to the first level of Area.
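As a rough sketch of that call, the example below fits the same kind of model to simulated data and predicts on the response scale with the random effect excluded. The data frame sim, its values, and the object names m, newd, and p_resp are made up for illustration (the data from the question is not available), and method = "REML" is my choice rather than something taken from the question.

```r
library(mgcv)

set.seed(1)
# Simulated stand-in for data_re (hypothetical values, not the asker's data)
n <- 200
sim <- data.frame(
  OverheadCover = runif(n),
  Area = factor(sample(c("A", "B", "C", "D", "E", "F"), n, replace = TRUE))
)
area_eff <- rnorm(nlevels(sim$Area), sd = 0.5)   # one random intercept per Area
mu <- plogis(0.5 - 1.5 * sim$OverheadCover + area_eff[as.integer(sim$Area)])
phi <- 10                                        # beta precision, chosen arbitrarily
sim$y <- rbeta(n, mu * phi, (1 - mu) * phi)

m <- gam(y ~ OverheadCover + s(Area, bs = "re"),
         family = betar(link = "logit"), data = sim, method = "REML")

# Pad newdata with the first level of Area; the value is irrelevant
# because s(Area) is set to zero via exclude
newd <- data.frame(OverheadCover = seq(0, 1, length.out = 100))
newd$Area <- factor(levels(sim$Area)[1], levels = levels(sim$Area))

# type = "response" and exclude can be combined
p_resp <- predict(m, newdata = newd, type = "response", exclude = "s(Area)")
range(p_resp)   # values lie inside (0, 1)
```

Because the excluded smooth contributes zero to the linear predictor, padding Area with any other level of the factor gives the same predictions.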
If you are very careful you can avoid passing in the random-effect variable too. If you are sure that what you pass to newdata is a correctly specified set of variables for the model, then you can leave out Area and pass newdata.guaranteed = TRUE to predict(), which stops predict() from checking that you have correctly passed all the variables needed for the model.
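A minimal sketch of this variant, again on simulated data: sim2, m2, newd2, and p_guar are hypothetical names, the values are invented, and method = "REML" is my choice. Note that newd2 has no Area column at all.

```r
library(mgcv)

set.seed(2)
# Simulated stand-in data (hypothetical; not the data from the question)
n <- 150
sim2 <- data.frame(
  OverheadCover = runif(n),
  Area = factor(sample(LETTERS[1:6], n, replace = TRUE))
)
eta <- 0.3 - 1.2 * sim2$OverheadCover +
  rnorm(6, sd = 0.4)[as.integer(sim2$Area)]      # add a random intercept per Area
sim2$y <- rbeta(n, plogis(eta) * 10, (1 - plogis(eta)) * 10)

m2 <- gam(y ~ OverheadCover + s(Area, bs = "re"),
          family = betar(link = "logit"), data = sim2, method = "REML")

# newdata contains only the covariate of the term that is kept
newd2 <- data.frame(OverheadCover = seq(0, 1, length.out = 50))

# newdata.guaranteed = TRUE skips the checks on newdata, so the missing
# Area column is not a problem as long as s(Area) is excluded
p_guar <- predict(m2, newdata = newd2, type = "response",
                  exclude = "s(Area)", newdata.guaranteed = TRUE)
head(p_guar)
```

The trade-off is that predict() no longer catches mistakes in newdata, such as a misspelled or missing covariate for a term that is not excluded.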
See the example in ?mgcv::random.effects for both types of behaviour.