How to get specific elements and subelements from lists stored inside a data frame built with tidyr package?


I'm running a piecewise linear regression within each group of a data frame, using the segmented and tidyr packages. My main objective is to build another data frame containing the results of these regressions (slopes, intercepts and R²).

So far I've been able to do this for R² and the intercepts. The slopes have been the hardest part, which is why I'm asking for help. The segmented package returns its results as lists, and the most complex one is the list of slopes, shown here for illustration only (the MWE further down shows how it was built):

> Segmented.values$slopes
$x
        Est. St.Err. t value CI(95%).l CI(95%).u
slope1  1.00 0.10206  9.7980   0.56086   1.43910
slope2 -0.05 0.10206 -0.4899  -0.48914   0.38914

$x
        Est.  St.Err. t value CI(95%).l CI(95%).u
slope1  2.00 0.061237 32.6600   1.73650   2.26350
slope2 -0.05 0.061237 -0.8165  -0.31348   0.21348

From this list I only want the values in the Est. column, but none of my attempts have worked. My example:

library(segmented)
library(tidyr)
library(dplyr)

Group <- c("A", "B")
x <- 0:5
y <- c(0, 1, 2, 2.1, 2.3, 2,
       0, 2, 4, 4.5, 4.3, 4.4)

df <- expand.grid(x = x,
                  Group = Group)

df$y <- y

Segmented.values <- df %>%
  nest_by(Group) %>%
  mutate(my.lm = list(lm(data = data,
                         formula = y ~ x)),
         my.seg = list(segmented(my.lm,
                                 seg.Z = ~ x)),
         intercepts = list(intercept(my.seg)),
         slopes = list(slope(my.seg)),
         R2 = list(summary(my.seg)$r.squared)) %>%
  unnest(c(intercepts,
           slopes,
           R2)) %>%
  select(-data,
         -my.lm,
         -my.seg,
         -intercepts,
         -R2) %>%
  unnest_wider(slopes,           
               names_sep = " ")

The result is a table with all the slope data, but I only need the columns called slopes 1 and slopes 2 (that is, the Est. column of each matrix in the list).

> Segmented.values
# A tibble: 2 x 11
# Groups:   Group [2]
  Group `slopes 1` `slopes 2` `slopes 3` `slopes 4` `slopes 5` `slopes 6` `slopes 7` `slopes 8` `slopes 9` `slopes 10`
  <fct>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>       <dbl>
1 A              1      -0.05     0.102      0.102        9.80     -0.490      0.561     -0.489       1.44       0.389
2 B              2      -0.05     0.0612     0.0612      32.7      -0.816      1.74      -0.313       2.26       0.213

Solution

  • Each element of the slopes column is a matrix with several columns in addition to 'Est.'. For example, if we extract one of the list elements:

            Est. St.Err. t value CI(95%).l CI(95%).u
    slope1  1.00 0.10206  9.7980   0.56086   1.43910
    slope2 -0.05 0.10206 -0.4899  -0.48914   0.38914
    

    We can extract only the 'Est.' column before calling unnest_wider:

    library(dplyr)
    library(purrr)
    library(tidyr)
    library(segmented)
    df %>%
      nest_by(Group) %>%
      transmute(my.lm = list(lm(data = data,
                                formula = y ~ x)),
                my.seg = list(segmented(my.lm,
                                        seg.Z = ~ x)),
                intercepts = list(intercept(my.seg)),
                slopes = list(slope(my.seg)),
                R2 = list(summary(my.seg)$r.squared)) %>%
      ungroup() %>%
      select(Group, slopes) %>%
      unnest(slopes) %>% # unwrap the outer list so each cell holds one matrix
      mutate(slopes = map(slopes, ~ .x[, "Est."])) %>% # get the 'Est.' column
      unnest_wider(slopes)
    

    Output:

    # A tibble: 2 x 3
      Group slope1 slope2
      <fct>  <dbl>  <dbl>
    1 A          1  -0.05
    2 B          2  -0.05
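
To see why the original attempt produced ten columns and why subsetting first fixes it, here is a minimal base R sketch that does not fit any model. The matrices are hand-built stand-ins for what slope() returns, using numbers copied from the printed output in the question, and only the first two of the five columns are reproduced. The helper make_slopes and the object names are hypothetical and exist only for this illustration.

```r
# Hypothetical helper: builds a stand-in for one slope() result.
# Values are copied from the question's printed output, not recomputed.
make_slopes <- function(est, se) {
  m <- cbind(est, se)
  dimnames(m) <- list(c("slope1", "slope2"), c("Est.", "St.Err."))
  list(x = m)  # the question's output shows the matrix stored under $x
}

slopes <- list(
  A = make_slopes(c(1, -0.05), c(0.10206, 0.10206)),
  B = make_slopes(c(2, -0.05), c(0.061237, 0.061237))
)

# A matrix is stored column by column, so flattening it interleaves
# the statistics: both Est. values first, then both St.Err. values.
flat_A <- as.vector(slopes$A$x)

# Keep only the "Est." column of each matrix, then put one group per row.
est <- t(vapply(slopes, function(s) s$x[, "Est."], numeric(2)))
slope_table <- data.frame(Group = rownames(est), est, row.names = NULL)
print(slope_table)
```

The flattened vector flat_A starts with the two Est. values followed by the two St.Err. values, which matches the order of `slopes 1` to `slopes 4` in the question's table. Selecting the "Est." column before widening leaves exactly one value per slope.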