Search code examples
rpredictmixed-modelscubic-spline

Getting predictions from mlrcs function in merlin package in R


I am using the mlrcs function in the merlin package to fit a restricted cubic spline model with a random variable. The model works great, but there does not seem to be any functionality for getting model predictions or contrasts. The "predict" function does not work for example. The description of the mlrcs function online does not give any information about how to do post estimation. Is it possible with the mlrcs function? Or do I need to write my model code using the merlin function? If the latter, can someone help me with the syntax? If I try to just change the mlrcs function to merlin, the code does not run. I've pasted a dummy example below, with a structure similar to my real dataset.

dat = as.data.frame(list(fish = as.factor(c(rep("a",6),rep("b",6),rep("c",6),rep("d",6))),
                         value = as.numeric(c(1,3,7,7,6,7,2,4,8,7,7,6,5,8,10,11,12,10,3,7,9,9,8,9)),
                         time = as.numeric(rep(1:6,4)),
                         location = as.numeric(c(rep("0",6),rep("1",6)))))
                    
dat
str(dat)


library(ggplot2)

ggplot(dat, aes(x=time, y=value, group=fish, col=fish)) +
  geom_line()

Here, there were 4 fish in the experiment (a-d), and at each point in time (6 time points), a value was measured for each fish. Location is a dummy variable that tells where the fish came from. Here, fishes "a" and "b" came from location "0" while fishes "c" and "d" came from location "1". Since it is a repeated measures design, fish ID is included as a random variable.

My goal is to examine how value changes over time and whether location is a significant factor (also if there is a significant time x location interaction). Here is the model I successfully fit:

mod <- mlrcs(formula = value ~ 1 + rcs(time, 3) + location + time:location, random  = ~ 1|fish, data = dat)
summary(mod)

This model gives me the below output, which is as expected:

> summary(mod)
Restricted cubic splines model
Log likelihood = -29.25573

               Estimate Std. Error      z Pr(>|z|) [95% Conf. Interval]
rcs():1         1.89559    0.18222 10.402   0.0000    1.53843   2.25274
rcs():2         1.29086    0.12892 10.013   0.0000    1.03818   1.54355
rcs():3        -0.01131    0.12886 -0.088   0.9301   -0.26386   0.24125
location       -2.79342    0.63706 -4.385   0.0000   -4.04204  -1.54480
time:location  -0.24013    0.15088 -1.592   0.1115   -0.53585   0.05558
_cons           9.26707    0.24533 37.773   0.0000    8.78623   9.74792
log_sd(resid.) -0.46044    0.14631 -3.147   0.0017   -0.74721  -0.17366
log_sd(M1)      0.53407    0.08140  6.561   0.0000    0.37453   0.69361

Now, I want to get model predictions, so I can plot them. I also want to get contrasts between the two locations for each level of time (i.e., at time point 1, is there a significant difference in predicted value between location "0" and location "1"?). Please see the errors I get below.

> predict(mod)
Error in UseMethod("predict") : 
  no applicable method for 'predict' applied to an object of class "mlrcs"


#Try with changing to merlin function, changed "formula" to "model":
> mod2 <- merlin(model = value ~ 1 + rcs(time, 3) + location + time:location, random  = ~ 1|fish, data = dat)
Error in merlin(model = value ~ 1 + rcs(time, 3) + location + time:location,  : 
  unused argument (random = ~1 | fish)


Solution

  • Reading the documentation for this model, I was able to make mod2 yield the same output as mod by adding "*1" to the M1 argument. The documentation says that adding this constrains the random effect term to 1.

    Here is the code:

    mod3 = merlin(
      model = value ~ rcs(time, 3) + location + time:location + M1[fish]*1,
      family = "gaussian",
      data = dat,
      levels = c("fish")
    )
    
    summary(mod3)
    Mixed effects regression model
    Log likelihood = -29.25573
    
                   Estimate Std. Error      z Pr(>|z|) [95% Conf. Interval]
    rcs():1         1.89559    0.18222 10.402   0.0000    1.53843   2.25274
    rcs():2         1.29086    0.12892 10.013   0.0000    1.03818   1.54355
    rcs():3        -0.01131    0.12886 -0.088   0.9301   -0.26386   0.24125
    location       -2.79342    0.63706 -4.385   0.0000   -4.04204  -1.54480
    time:location  -0.24013    0.15088 -1.592   0.1115   -0.53585   0.05558
    _cons           9.26707    0.24533 37.773   0.0000    8.78623   9.74792
    log_sd(resid.) -0.46044    0.14631 -3.147   0.0017   -0.74721  -0.17366
    log_sd(M1)      0.53407    0.08140  6.561   0.0000    0.37453   0.69361
    
    Integration method: Non-adaptive Gauss-Hermite quadrature 
    Integration points: 7 
    class(mod3)
    [1] "merlin"
    

    As an aside, checking the methods on predict shows that it appears to have a method for objects of class merlin but not of class mlrcs. This is why only the merlin function works with predict.

    methods(predict)
     [1] predict.ar*                predict.Arima*            
     [3] predict.arima0*            predict.glm               
     [5] predict.HoltWinters*       predict.lm                
     [7] predict.loess*             predict.merlin*           
     [9] predict.mlm*               predict.nls*              
    [11] predict.poly*              predict.ppr*              
    [13] predict.prcomp*            predict.princomp*         
    [15] predict.smooth.spline*     predict.smooth.spline.fit*
    [17] predict.StructTS*