How to predict from a GAM with smooth terms and basis functions on independent (new) data?


I am trying to fit a GAM with interactions between day (the tt variable) and lagged predictors (lag k = 2), using k = 10 basis functions for each smooth.

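library(mgcv)
# Lag() is not part of base R or mgcv; the post does not say which package
# provides it (quantmod's Lag(x, k) matches the k = 2 call used below).

# Example data
data <- data.frame(
  tt     = 1:107,  # days
  pol    = (sample.int(101, size = 107, replace = TRUE) - 1) / 100,
  at_rec = sample.int(101, size = 107, replace = TRUE),
  w_cas  = sample.int(2000, size = 107, replace = TRUE)
)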

# model
gam1<-gam(pol ~ s(tt, k = 10) + 
            s(tt, by = Lag(at_rec, k = 2), k = 10)+
            s(tt, by = Lag(w_cas, k = 2), k = 10), 
          data=data,method="GACV.Cp")
summary(gam1)

# new data
newdata <- data.frame(tt = c(12, 22), at_rec = c(44, 34), w_cas = c(2011, 2455))
# prediction
predict(gam1, newdata = newdata, se.fit = TRUE)

The predict call fails with this error: "Error in PredictMat(object$smooth[[k]], data) : Can't find by variable"

How can I predict from such a model with new data?


Solution

  • I'm 99.9% sure that the predict method can't find the by terms because they are function calls rather than plain variables: it looks in newdata for variables with exactly the names you provided, e.g. "Lag(at_rec, k = 2)".

    Try adding those lagged variables to your data frame as explicit columns and refitting the model. Prediction should then work, provided newdata contains columns with the same names:

    data <- transform(data,
                      lag_at_rec = Lag(at_rec, k=2),
                      lag_w_cas = Lag(w_cas, k=2))
    gam1 <- gam(pol ~ s(tt, k = 10) + 
                  s(tt, by = lag_at_rec, k = 10)+
                  s(tt, by = lag_w_cas, k = 10), 
                data = data, method = "GACV.Cp")
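
To round out the refit above, here is a minimal sketch of the prediction step. It assumes a hypothetical base-R helper, lag_k(), standing in for Lag() so the example runs with only mgcv installed, and it assumes the values placed in newdata are already the lagged values (the predictors two days earlier), entered under the column names used in the formula. The seed and the reuse of the question's numbers as lagged values are illustrative only.

```r
library(mgcv)

# Hypothetical stand-in for Lag(): shift x down by k rows, padding with NA.
lag_k <- function(x, k = 2) c(rep(NA, k), head(x, -k))

set.seed(1)  # illustrative seed, not from the original post
data <- data.frame(
  tt     = 1:107,
  pol    = (sample.int(101, size = 107, replace = TRUE) - 1) / 100,
  at_rec = sample.int(101, size = 107, replace = TRUE),
  w_cas  = sample.int(2000, size = 107, replace = TRUE)
)

# Store the lagged predictors as ordinary columns.
data <- transform(data,
                  lag_at_rec = lag_k(at_rec),
                  lag_w_cas  = lag_k(w_cas))

gam1 <- gam(pol ~ s(tt, k = 10) +
              s(tt, by = lag_at_rec, k = 10) +
              s(tt, by = lag_w_cas, k = 10),
            data = data, method = "GACV.Cp")

# Each smooth records its by variable as a name; predict() looks that
# name up in newdata.
by_name <- gam1$smooth[[2]]$by

# newdata must therefore contain lag_at_rec and lag_w_cas, not at_rec and w_cas.
newdata <- data.frame(tt         = c(12, 22),
                      lag_at_rec = c(44, 34),
                      lag_w_cas  = c(2011, 2455))

pred <- predict(gam1, newdata = newdata, se.fit = TRUE)
pred$fit     # predicted pol for the two new rows
pred$se.fit  # corresponding standard errors
```

If the new observations come as a raw time series rather than pre-lagged values, the same lagging step would need to be applied to that series before calling predict().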