Search code examples
rlinear-regressionlme4mixed-modelsconfidence-interval

Obtaining intercept and slope per patient (as diagnostics in a repeated measurements study) using the lmLst and intervals functions


I have a repeated measurements dataset of 24 stroke patients in which I want to assess the effect of three different types of rehabilitation (Group) on functional recovery scores (Barthel_index). Each patients functional ability was measured weekly (Time_num) for 8 weeks.

The data looks as follows:

library(dplyr)
library(magrittr)
library(nlme)
library(lmer)

mydata <- 
structure(list(Subject = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 
11L, 11L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 
13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 16L, 
16L, 16L, 16L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 18L, 18L, 
18L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 
19L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 21L, 21L, 21L, 21L, 
21L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 23L, 
23L, 23L, 23L, 23L, 23L, 23L, 23L, 24L, 24L, 24L, 24L, 24L, 24L, 
24L, 24L), Group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
    Time_num = c(1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 
    8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 
    3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 
    6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 
    1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 
    4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 
    7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 
    2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 
    5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 
    8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 
    3, 4, 5, 6, 7, 8), Barthel_index = c(45L, 45L, 45L, 45L, 
    80L, 80L, 80L, 90L, 20L, 25L, 25L, 25L, 30L, 35L, 30L, 50L, 
    50L, 50L, 55L, 70L, 70L, 75L, 90L, 90L, 25L, 25L, 35L, 40L, 
    60L, 60L, 70L, 80L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 
    100L, 20L, 20L, 30L, 50L, 50L, 60L, 85L, 95L, 30L, 35L, 35L, 
    40L, 50L, 60L, 75L, 85L, 30L, 35L, 45L, 50L, 55L, 65L, 65L, 
    70L, 40L, 55L, 60L, 70L, 80L, 85L, 90L, 90L, 65L, 65L, 70L, 
    70L, 80L, 80L, 80L, 80L, 30L, 30L, 40L, 45L, 65L, 85L, 85L, 
    85L, 25L, 35L, 35L, 35L, 40L, 45L, 45L, 45L, 45L, 45L, 80L, 
    80L, 80L, 80L, 80L, 80L, 15L, 15L, 10L, 10L, 10L, 20L, 20L, 
    20L, 35L, 35L, 35L, 45L, 45L, 45L, 50L, 50L, 40L, 40L, 40L, 
    55L, 55L, 55L, 60L, 65L, 20L, 20L, 30L, 30L, 30L, 30L, 30L, 
    30L, 35L, 35L, 35L, 40L, 40L, 40L, 40L, 40L, 35L, 35L, 35L, 
    40L, 40L, 40L, 45L, 45L, 45L, 65L, 65L, 65L, 80L, 85L, 95L, 
    100L, 45L, 65L, 70L, 90L, 90L, 95L, 95L, 100L, 25L, 30L, 
    30L, 35L, 40L, 40L, 40L, 40L, 25L, 25L, 30L, 30L, 30L, 30L, 
    35L, 40L, 15L, 35L, 35L, 35L, 40L, 50L, 65L, 65L)), row.names = c(NA, 
-192L), class = c("tbl_df", "tbl", "data.frame"))

head(mydata)

# A tibble: 6 x 4
  Subject Group Time_num Barthel_index
    <int> <fct>    <dbl>         <int>
1       1 A            1            45
2       1 A            2            45
3       1 A            3            45
4       1 A            4            45
5       1 A            5            80
6       1 A            6            80

To see if and how intercepts and slopes vary per patient I want to plot the intercepts and slopes using the lmList and interval functions.

Question 1 I don't understand why calling the lmList function () in lme4 gives me 48 warnings while the same function in nlme does not:

lmlist <- 
  lme4::lmList(Barthel_index ~ Time_num | Subject, 
         data=mydata)
> There were 48 warnings (use warnings() to see them)

lmlist <- 
  nlme::lmList(Barthel_index ~ Time_num | Subject, 
         data=mydata)
# Works fine

Question 2 I am trying to extract the confidence intervals for each regression slope, but this gives a warning and NaN for certain values:

lmlist <- 
  nlme::lmList(Barthel_index ~ Time_num | Subject, 
         data=mydata)
coefs <- coef(lmlist)
names(coefs) <- c("Intercepts", "Slopes")
intervals(lmlist)

> Warning message:
In summary.lm(el) : essentially perfect fit: summary may be unreliable

Question 3 Now that I have my new list of coefficients with confidence intervals, I'd like to plot them to see if and how much intercepts and slopes vary amongst patients. I'm trying to achieve something like the following:

enter image description here Any help? Thanks.


Solution

  • Q1. The warnings are occurring in lme4::lmList because you're using a tibble as input: no warnings from

    lme4::lmList(Barthel_index ~ Time_num | Subject,
             data=as.data.frame(mydata))
    

    (this is a harmless "infelicity" or buglet in lme4 ...)

    Q2. If you look at the list of coefficients, you'll see that subject 5 is the problematic one. The data for this subject all have the same response value: thus it's not surprising that we can't compute confidence intervals on a linear regression fit ...

     mydata[mydata$Subject=="5",]
    # A tibble: 8 × 4
      Subject Group Time_num Barthel_index
        <int> <fct>    <dbl>         <int>
    1       5 A            1           100
    2       5 A            2           100
    3       5 A            3           100
    4       5 A            4           100
    5       5 A            5           100
    6       5 A            6           100
    7       5 A            7           100
    8       5 A            8           100
    

    Q3 plot(intervals(lmlist))

    enter image description here