
GAMM4 smoothing spline for time variable


I am constructing a GAMM (for the first time) to compare longitudinal slopes of cognitive performance between a Bipolar Disorder (BD) sample and a healthy control (HC) sample. The study uses an "accelerated longitudinal" design: participants spanning a wide age range (25-60) are followed for 2 years (HC group) or 4 years (BD group).

Hypothesis (1) The BD group’s yearly rate of change on processing speed will be higher overall than the healthy control group, suggesting a more rapid cognitive decline in BD than seen in HC.

Here is my R code formula, which I think is a bit off:

RUN2 <- gamm4(BACS_SC_R ~ group + s(VISITMONTH, bs = "cc") + 
s(VISITMONTH, bs = "cc", by=group), random=~(1|SUBNUM), data=Df, REML = TRUE)

The visitmonth variable is coded as "months from first visit." Visit 1 would equal 0, and the following visits (3 per year) are coded as months elapsed from visit 1. Is a cyclic smooth correct in this case?

I plan on adding additional variables (e.g. peripheral inflammation) to the model to predict individual slopes of cognitive trajectories in BD.

If you have any other suggestions, it would be greatly appreciated. Thank you!


Solution

  • If VISITMONTH spans years (i.e. for a BD observation VISITMONTH takes values in {0, 1, 2, ..., 48} over the four years), then no, you don't want a cyclic smooth, unless there is some 4-year periodicity that would mean the fitted values at months 0 and 48 should be constrained to be the same.

    The default thin plate spline bs = 'tp' should suffice.

    I'm also assuming that there are many possible values for VISITMONTH, as not everyone was followed up at the same monthly intervals? Otherwise you're not going to have many degrees of freedom available for the temporal smooth, because the basis dimension k cannot exceed the number of unique VISITMONTH values.

    Is group coded as an ordered factor here? If so that's great; the by smooth will encode the difference between the reference level (be sure to set HC as the reference level) and the other level, so you can see directly in the summary a test of whether the BD smooth differs from the HC smooth.

    It's not clear how you are dealing with the fact that the HC group is followed up over fewer months than the BD group. It looks like the model has VISITMONTH representing the full time of the study, not just a within-year term. So how do you intend to compare the BD group with the HC group for the 2 years where the HC group is not observed?
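
To make the points above concrete, here is a minimal sketch of how the model might be set up. It runs on simulated data, since the real data are not shown: the sample sizes, visit schedule, scores and rates of decline are all invented, and sim_subject, follow_up, overlap_max and newd are hypothetical names. Only the variable names (BACS_SC_R, VISITMONTH, group, SUBNUM, Df, RUN2) come from the question. It codes group as an ordered factor with HC as the reference level, swaps the cyclic basis for the default thin plate spline, and restricts the group comparison to the months both groups were observed, which is one possible way to handle the unequal follow-up rather than the only one.

```r
library(gamm4)  # also loads mgcv and lme4

set.seed(1)

# Simulate one subject: a first visit at month 0, then about 3 visits per
# year at irregular months. All numbers below are invented for illustration.
sim_subject <- function(id, grp, max_month) {
  n_visits <- 3 * max_month / 12
  months <- c(0, sort(sample(seq_len(max_month), n_visits)))
  slope <- if (grp == "BD") -0.15 else -0.05   # made-up rates of decline
  data.frame(
    SUBNUM = id,
    group = grp,
    VISITMONTH = months,
    # 50 + subject-specific intercept + linear trend + residual noise
    BACS_SC_R = 50 + rnorm(1, sd = 5) + slope * months +
      rnorm(length(months), sd = 2)
  )
}

Df <- do.call(rbind, c(
  lapply(1:30,  sim_subject, grp = "HC", max_month = 24),  # 2 years
  lapply(31:60, sim_subject, grp = "BD", max_month = 48)   # 4 years
))
Df$SUBNUM <- factor(Df$SUBNUM)

# Ordered factor with HC as the reference level, so the by-smooth is the
# BD-minus-HC difference smooth.
Df$group <- factor(Df$group, levels = c("HC", "BD"), ordered = TRUE)

# How many distinct time points are there, and how far is each group followed?
length(unique(Df$VISITMONTH))
follow_up <- tapply(Df$VISITMONTH, Df$group, max)
overlap_max <- min(follow_up)

# Thin plate smooths instead of cyclic ones; random intercept per subject.
RUN2 <- gamm4(
  BACS_SC_R ~ group + s(VISITMONTH, bs = "tp") +
    s(VISITMONTH, bs = "tp", by = group),
  random = ~ (1 | SUBNUM),
  data = Df,
  REML = TRUE
)

# The s(VISITMONTH):groupBD row tests the difference smooth.
summary(RUN2$gam)

# Population-level predictions, limited to months observed in both groups.
newd <- expand.grid(
  VISITMONTH = seq(0, overlap_max, by = 1),
  group = factor(c("HC", "BD"), levels = c("HC", "BD"), ordered = TRUE)
)
newd$fit <- as.numeric(predict(RUN2$gam, newdata = newd))
```

With real data you would skip the simulation and start from the factor() call. Beyond overlap_max, any comparison of the two groups rests on extrapolating the HC curve, so predictions there should be treated with caution.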