Search code examples
rglmgammgcv

How can I apply interaction term with linear predictors in mgcv package


I have a question about using interaction term in mgcv package with 2 linear predictor.

I wrote the code to fit interaction between x1 and x2,

mgcv::gam(y ~ x1 + x2 + ti(x1, x2, k = 3), method = "REML", data = DATA)

but I'm really not sure this is correct way to using ti function in mgcv.

Should I use different method (like glm?) to do this work?

Thank you for your answer.


Solution

  • If you want a non-linear interaction, you have two choices:

    gam(y ~ te(x1, x2), method = "REML", data = DATA)
    

    or

    gam(y ~ s(x1) + s(x2) + ti(x1, x2), method = "REML", data = DATA)
    

    In the first model, the main effects and interactions are bound up in the one two-dimensional function te(x1,x2).

    In the second model, because ti(x1, x2) will exclude the main smooth effects or the marginal terms (x1, x2), you should also include the main smooth effects of these terms too using s().

    With your formulation, you would not get any non-linear main effects, only linear main effects and a non-linear pure-interaction, which is likely not what you want.

    Here's an example:

    library("mgcv")
    library("gratia")
    library("ggplot2")
    library("patchwork")
    
    set.seed(1)
    df2 <- data_sim(2, n = 1000, dist = "normal", scale = 1, seed = 2)
    
    m1 <- gam(y ~ s(x, k = 5) + s(z, k = 5) + ti(x, z, k = 5),
              data = df2,
              method = "REML")
    m2 <- gam(y ~ x + z + ti(x, z, k = 5),
              data = df2
              method = "REML")
    
    pl <- plot_layout(nrow = 1, ncol = 3)
    p1 <- draw(m1) + pl
    p2 <- draw(m2, parametric = TRUE) + pl
    
    p1 - p2 + plot_layout(nrow = 2)
    

    which produces

    enter image description here

    Notice how in this case, you would be missing the non-linear in the marginal smooths/terms, which is not accounted for in the ti() term, because it has no marginal main effects (the ti() smooths are the same across both models).

    If you just want to fit a linear "main effects" and their linear interaction, just use the formula as you would with glm():

    gam(y ~ x1 + x2 + x1:x2, ....)
    

    Note that the term "linear predictor" refers to the entire model (in the case), or more specifically the entire formula on the RHS of ~.