Search code examples
rmgcv

mgcv: How to identify exact knot values in a gam and gamm model?


I am using gams to fit a resource selection function to identify energy development thresholds for migratory deer. My model looks like this:

m4 <-gam(used~ti(propwells_buff_res1500, bs = "cr", k = 5) +
ti(year, bs = "cr", k = 5) + 
ti(propwells_buff_res1500, year, bs = "cr", k = 5), 
family = binomial(link = "cloglog"), data=mov, gamma=1.4, method="ML")

used is animal used locations, propwells_buff_res1500 are randomly generated "available" points (buffered by 1500m radii circle) that have differing amounts of energy development within them. I've constrained knots to 5, however, I'd like to be able to extract exact knot values because it's my understanding that knot values represent threshold values...aka the percent of surface disturbance where animal use drops off.

I hope this makes sense. If it doesn't, all I want to know how to do is get knot values. From plot(m4) I can eyeball where the slope of my non-linear line starts to change, but it would be very helpful to know the exact values.

So far, I have tried:

smooth <- m4$smooth[[3]]

smooth$knots 
##this knot option isn't available to me, 
##I saw it in an old post from 2016, figured out that XP should replace knots

smooth$XP
##and all this returns is list()

I would truly appreciate any help, thanks.


Solution

  • To get the knots, you can extract the xp component of the marginal smooth terms (note it is lowercase xp as there is an XP at the top level of the smooth which is something else).

    Here's an example

    library('mgcv')
    ## simulate some data
    set.seed(1729)
    df <- gamSim(2) # this is a bivariate example
    ## fit the model
    mod <- gam(y ~ ti(x, bs = 'cr', k = 5) + 
                   ti(z, bs = 'cr', k = 5) +
                   ti(x, z, bs = rep('cr', 2), k = 5),
               data = df$data, method = 'REML')
    ## extract the 3rd smooth
    sm <- mod[['smooth']][[3]]
    

    The marginal bases are in sm$margin, which is simply a list of two smooth objects:

    r$> str(sm$margin, max = 1)                          
    List of 2
     $ :List of 21
      ..- attr(*, "class")= chr [1:2] "cr.smooth" "mgcv.smooth"
      ..- attr(*, "qrc")=List of 4
      .. ..- attr(*, "class")= chr "qr"
      ..- attr(*, "nCons")= int 1
     $ :List of 21
      ..- attr(*, "class")= chr [1:2] "cr.smooth" "mgcv.smooth"
      ..- attr(*, "qrc")=List of 4
      .. ..- attr(*, "class")= chr "qr"
      ..- attr(*, "nCons")= int 1
    

    Each of these has a xp component:

    sm_x <- sm$margin[[1]]
    sm_z <- sm$margin[[2]]
    

    Hence the knots for the marginal CRS of x are:

    r$> sm_x$xp
              0%          25%          50%          75%         100%
    0.0005697084 0.2477067126 0.4704501621 0.7121602102 0.9960833385
    

    and for z are

    r$> sm_z$xp
             0%         25%         50%         75%        100% 
    0.007381999 0.244705125 0.488819070 0.717802322 0.991505836
    

    Why these values? They are at the quintiles of the observed covariate values:

    r$> with(df$data, quantile(x, probs = seq(0, 1, length = 5)))
              0%          25%          50%          75%         100%
    0.0005697084 0.2477067126 0.4704501621 0.7121602102 0.9960833385
    r$> with(df$data, quantile(z, probs = seq(0, 1, length = 5)))
             0%         25%         50%         75%        100% 
    0.007381999 0.244705125 0.488819070 0.717802322 0.991505836
    

    Which is how mgcv places knots for the CRS basis. The exact locations can be recovered using place.knots():

    r$> with(df$data, place.knots(x, 5))
    [1] 0.0005697084 0.2477067126 0.4704501621 0.7121602102 0.9960833385
    r$> with(df$data, place.knots(z, 5))
    [1] 0.007381999 0.244705125 0.488819070 0.717802322 0.991505836
    

    but it is safer to pull the knots from the marginal smooth objects as a user could always specify knots via the knots argument to gam().