Search code examples
rmixed-modelsnon-linear-regressionnlmepiecewise

How to perform piecewise linear mixed regression with multiple breakpoints in R?


I am fitting a piecewise linear mixed regression in R. I know that I can use lme from the nlme package followed by segmented to perform piecewise linear mixed regression. However, upon reading the documentation of the package segmented, I noticed that segmented.lme could only handle 1 breakpoint, in which I have two (at day 30 and day 90).

For a background, I want to model the mileage of cars (macars) at day 0, 30, 90, and 180 (variable days), with age as a confounder. Note that this model is illustrative and not the real data.

This is my prototype code which uses the lme package, before I get confused after reading that segmented.lme can only handle 1 breakpoint:

fit <- lme(macars ~ days + age, random = ~ days | id, data = df)
summary(fit)
pw.fit <- segmented(fit, seg.Z = ~ days, psi = list(days = c(30, 90)),  random = list(id = pdDiag(~1 + days))
summary(pw.fit)

Any help would be greatly appreciated. Thank you very much


Solution

  • If you know where the breakpoints are, as suggested by @user255430, you can construct a linear B-spline basis (or more specifically ask the model function to construct one for you) via bs(days, knots = c(30, 90), degree = 1).

    However, this isn't parameterized the way you think it is.

    library(splines)
    days <- 1:200
    X <- bs(days, knots = c(30, 90), degree = 1)
    par(las = 1, bty = "l") ## cosmetic
    matplot(X, type = "l")
    

    enter image description here

    As you can see, the last section (days > 90) is predicted by the sum of the second (red dashed) and third (green dotted) components, not just the third component.

    You can use a truncated power basis spline instead:

    library(cplm)q
    ## set k=3 to suppress warning
    p <- tp(days, knots = c(30, 90), k = 3, degree = 1)
    X <- cbind(p$X, p$Z)
    matplot(X, type = "l")
    

    enter image description here

    However, this is somewhat inconvenient; for only two knots, you can include the components in your model via

    ~ ... days + I(days*(days>30)) + I(days*(days>90))
    

    In general you should at least consider including all of these terms in your random effects components as well.

    You can use emmeans::emmeans to predict the 'intercept' (predicted value) at different time points, and emmeans::emtrends to predict the slope at those time points.