Intelligently extract linear model coefficients from a nested list


I'm an R veteran who usually hates working with nested lists because of how tricky they seem to get... but I am not sure I can avoid them here. In this case, I can make the output that I want, but I'm clueless about how to intelligently take apart the list once it's made.

I'm trying to fit a separate linear model for each level of a class variable in a dataset. After running all of the models, I want a simple table with one row per level that holds the class, slope, and intercept. The example below is what I'm after:

# Dummy data
d <- data.frame(x = rnorm(50, 10, 1),
                y = rnorm(50, 0, 2),
                class = rep(c('A', 'B', 'C', 'D', 'E'), each = 10))

# Split the data by grouping variable
d.s <-  split(d, d$class)

# Fit a linear model y ~ x within one class and return its coefficients
# (named "(Intercept)" and "x")
coeffs <- function(df) {
  coef(lm(y ~ x, data = df))
}

m.s <- lapply(d.s, coeffs)

# How do I neatly get a data frame that looks like below out of m.s??

wanted <- data.frame(class = character(), slope = numeric(), intercept = numeric())

Please excuse my aversion and inexperience with nested lists! I can get what I want after many lines of unlisting and picking apart the rownames, but there's got to be a better way. I'm trying to change...


Solution

  • You could use sapply in the first place, which simplifies the results to a matrix that you can transpose:

    > t(sapply(d.s, coeffs))
      (Intercept)          x
    A   9.4390072 -0.8430022
    B  -5.2018384  0.4988027
    C  -7.9531678  0.8298487
    D  -0.9505984  0.1192621
    E  -1.8155237  0.1522270
    
    > data.frame(sort(unique(d$class)), t(sapply(d.s, coeffs))) |> 
    +   setNames(c('class', 'intercept', 'slope'))
      class  intercept      slope
    A     A  9.4390072 -0.8430022
    B     B -5.2018384  0.4988027
    C     C -7.9531678  0.8298487
    D     D -0.9505984  0.1192621
    E     E -1.8155237  0.1522270
    

    Or, if you need to keep the lapply result, rbind its elements:

    > data.frame(names(m.s), do.call('rbind', m.s)) |> 
    +   setNames(c('class', 'intercept', 'slope'))
      class  intercept      slope
    A     A  9.4390072 -0.8430022
    B     B -5.2018384  0.4988027
    C     C -7.9531678  0.8298487
    D     D -0.9505984  0.1192621
    E     E -1.8155237  0.1522270
    

    If you really want slope before intercept, reverse the column order before naming:

    > data.frame(class=names(m.s), do.call('rbind', m.s)[, 2:1]) |> 
    +   setNames(c('class', 'slope', 'intercept'))
      class      slope  intercept
    A     A -0.8430022  9.4390072
    B     B  0.4988027 -5.2018384
    C     C  0.8298487 -7.9531678
    D     D  0.1192621 -0.9505984
    E     E  0.1522270 -1.8155237
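
As a possible variation on the approaches above, here is a minimal self-contained sketch that uses vapply instead of sapply, so R raises an error if any group fails to return exactly two numeric coefficients. The seed value and the names fits and result are illustrative assumptions, not part of the original question or answer, and because the data are random the numbers will differ from the output shown above.

```r
# Dummy data as in the question; the seed is an arbitrary choice added
# here only so the sketch is reproducible
set.seed(42)
d <- data.frame(x = rnorm(50, 10, 1),
                y = rnorm(50, 0, 2),
                class = rep(c("A", "B", "C", "D", "E"), each = 10))

# Fit y ~ x within each class. vapply checks that every result is a
# numeric vector of length 2, and returns a 2 x 5 matrix:
# row 1 is the intercept, row 2 is the slope, columns are the classes
fits <- vapply(split(d, d$class),
               function(df) coef(lm(y ~ x, data = df)),
               FUN.VALUE = numeric(2))

# Build the table column by column, so names and order are explicit
result <- data.frame(class = colnames(fits),
                     slope = fits[2, ],
                     intercept = fits[1, ],
                     row.names = NULL)
result
```

Naming each column when the data frame is built avoids the separate setNames step and makes the slope/intercept order explicit, at the cost of indexing the matrix rows by position.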