
R: GAM with fit on subset of data


I am fitting a generalized additive model (GAM) using gam from the mgcv package. My data table contains a dependent variable Y, an independent variable X, other independent variables Oth, and a two-level factor Fac. I would like to fit the following model:

Y ~ s(X) + Oth

but with the additional constraint that the s(X) term is fit on only one of the two levels of the factor, say Fac==1. The other terms, Oth, should be fit on the whole data set.

I tried exploring s(X,by=Fac) but this biases the fit for Oth. In other words, I would like to express the belief that X relates to Y only if Fac==1, otherwise it does not make sense to model X.


Solution

  • If I understand correctly, you have in mind a model with an interaction like this:

    Y ~ Oth + (Fac==1)*s(X)
    

    If you want to "express the belief that X relates to Y only if Fac==1", treat Fac as a numeric 0/1 variable rather than as a factor. With a numeric by variable the smooth is multiplied by Fac, so you get a single set of smooth coefficients and the term contributes nothing when Fac is 0 (with a factor by variable there would be two smooths, one per level). This type of model is a varying coefficient model.

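    # simulated example data
    library(mgcv)
    data <- data.frame(Oth = runif(100),
                       X = runif(100),
                       Y = runif(100),
                       Fac = sample(0:1, 100, TRUE))
    # make sure Fac is numeric (0/1), not a factor
    data$Fac <- as.numeric(as.character(data$Fac))
    # fit the model: s(X) is multiplied by Fac, Oth is fit on all rows
    gam(Y ~ s(X, by = Fac) + Oth, data = data)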
    

    See the description of the by argument in ?s (and the discussion of by variables in ?gam.models).
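
One way to check that the numeric by variable behaves as described is to look at the per-term predictions. The following is a minimal sketch, not part of the original answer: the simulated data, the seed, the effect sizes and the object names (d, fit, terms, smooth_col) are all assumptions made for illustration. It fits the same kind of model and confirms that the smooth term contributes zero for every row with Fac == 0, while Oth is still estimated from all rows.

```r
library(mgcv)

set.seed(1)  # assumed seed, only for reproducibility
n <- 200

# Hypothetical data: X matters only when Fac == 1, Oth matters for everyone
d <- data.frame(X   = runif(n),
                Oth = runif(n),
                Fac = sample(0:1, n, TRUE))
d$Y <- 1 + 2 * d$Oth + d$Fac * sin(2 * pi * d$X) + rnorm(n, sd = 0.2)

# Numeric by variable: each row of the smooth's model matrix is multiplied by Fac
fit <- gam(Y ~ s(X, by = Fac) + Oth, data = d)

# Per-term contributions on the linear predictor scale
terms <- predict(fit, type = "terms")
smooth_col <- grep("s\\(X\\)", colnames(terms))

# Contribution of the smooth in each group
range(terms[d$Fac == 0, smooth_col])  # rows where the smooth is switched off
range(terms[d$Fac == 1, smooth_col])  # rows where the smooth is active
```

As far as I understand the mgcv documentation, a smooth with a non-constant numeric by variable is not centred, so it also absorbs any level shift between the two groups. If that is not what you want, compare against a model with an explicit Fac main effect before relying on this.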