
Do regression analysis for all the variable X and response G, for all data frames found under one data frame in R

I have a data frame (df) which looks like this:

      Amount Response
1          5       10
1         10       25
1          2       20
2         12       20
2          4        8
2          3        5

and I have split the data.frame into several data.frames based on their group number with

  out <- split( df , f = df$ )

Now I want to run a regression with lm (amount ~ response) on each of the new data.frames in "out". Please note that this is only an example: I actually have 500 split data.frames in "out".


  • Assume the data shown reproducibly in the Note at the end. Specify pool = FALSE as an lmList argument if you don't want to pool the standard errors.

    # 1 (assumes lmList from the nlme package)
    library(nlme)
    lmList(Response ~ Amount |, DF)
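
    To make the effect of pool concrete, here is a minimal sketch. It assumes lmList comes from the nlme package, and it uses Group as a hypothetical name for the grouping column, since the column's actual name is not shown in the text. The fitted coefficients should be identical either way; pool only changes how the standard errors reported by summary() are computed.

    ```r
    library(nlme)  # assumption: lmList is taken from nlme

    # Group is a hypothetical name for the grouping column.
    DF <- data.frame(
      Group    = factor(c(1, 1, 1, 2, 2, 2)),
      Amount   = c(5, 10, 2, 12, 4, 3),
      Response = c(10, 25, 20, 20, 8, 5)
    )

    # Default: residual standard error pooled across all groups.
    fit_pooled <- lmList(Response ~ Amount | Group, data = DF)

    # pool = FALSE: each group's standard errors use only its own residuals.
    fit_separate <- lmList(Response ~ Amount | Group, data = DF, pool = FALSE)

    # One row of coefficients (intercept and slope) per group.
    coef(fit_pooled)
    coef(fit_separate)
    ```

    Comparing summary(fit_pooled) with summary(fit_separate) then shows the difference in the reported standard errors.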

    An alternative is:

    # 2
    lm(Response ~ grp / (Amount + 1) - 1, transform(DF, grp = factor(

    or this which carries out completely separate regressions:

    # 3
    by(DF, DF$, function(DF) lm(Response ~ Amount, DF))

    This last line can also be written:

    # 3a
    by(DF, DF$, lm, formula = Response ~ Amount)
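
    Since the question already builds a list of data frames with split(), the separate-regressions idea in # 3 can also be sketched with lapply over that list. This is only an illustration: Group is a hypothetical name for the grouping column (its real name is not shown in the text), and fits and coefs are names introduced here.

    ```r
    # Group is a hypothetical name for the grouping column.
    DF <- data.frame(
      Group    = c(1, 1, 1, 2, 2, 2),
      Amount   = c(5, 10, 2, 12, 4, 3),
      Response = c(10, 25, 20, 20, 8, 5)
    )

    # Same shape as the question's `out`: one data frame per group.
    out <- split(DF, DF$Group)

    # Fit a completely separate regression to each piece.
    fits <- lapply(out, function(d) lm(Response ~ Amount, data = d))

    # Collect intercept and slope into a matrix with one row per group.
    coefs <- t(sapply(fits, coef))
    print(coefs)
    ```

    Because fits is a plain named list, the same pattern should scale to 500 groups without changes, and any per-model quantity can be pulled out with another sapply call.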

    R squared

    We can compute R squared by group using any of these:

    summary(lmList(Response ~ Amount |, DF))$r.squared
    c(by(DF, DF$, function(x) summary(lm(Response ~ Amount, x))$r.squared))
    reg.list <- by(DF, DF$, lm, formula = Response ~ Amount)
    sapply(reg.list, function(x) summary(x)$r.squared)
    c(by(DF, DF$, with, cor(Response, Amount)^2))
    library(dplyr)
    DF %>%
      group_by( %>%
      do(summarize(., r.squared = summary(lm(Response ~ Amount, .))$r.squared))
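
    The cor(Response, Amount)^2 variant works because, for a regression with one predictor and an intercept, R squared equals the squared Pearson correlation between the two variables. A minimal base R check of that agreement follows; Group is a hypothetical name for the grouping column (its real name is not shown in the text), and by_group, r2_lm and r2_cor are names introduced here.

    ```r
    # Group is a hypothetical name for the grouping column.
    DF <- data.frame(
      Group    = c(1, 1, 1, 2, 2, 2),
      Amount   = c(5, 10, 2, 12, 4, 3),
      Response = c(10, 25, 20, 20, 8, 5)
    )

    by_group <- split(DF, DF$Group)

    # R squared taken from each group's fitted model.
    r2_lm <- sapply(by_group, function(d) summary(lm(Response ~ Amount, data = d))$r.squared)

    # Squared correlation computed directly, without fitting a model.
    r2_cor <- sapply(by_group, function(d) cor(d$Response, d$Amount)^2)

    # The two should agree up to floating point error.
    all.equal(r2_lm, r2_cor)
    ```

    The shortcut does not carry over to models with more than one predictor, where the summary(lm(...)) route is the one to use.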

    Note


    Lines <- " Amount Response
    1          5       10
    1         10       25
    1          2       20
    2         12       20
    2          4        8
    2          3        5"
    DF <- read.table(text = Lines, header = TRUE)