r merge output concatenation linear-regression

Combining regression output from by() into a single table


I'm new to R, coding, and Stack Overflow, so apologies in advance if this is a basic question. I'm trying to combine the regression output for the 3 levels of the variable "Gender" into a single summary table that retains all of the information in the coefficient columns as well as the values listed at the bottom of each output (residual standard error, R-squared, adjusted R-squared, F-statistic, p-value). Is anyone aware of an approach that works?

Here is my code and what the output currently looks like:

library(tidyverse)
Final_Frame.df <- read_csv("indirect.csv")

my.fun <- function(Final_Frame2.df){summary(lm(Product_Use~Mean_social_combined +
  Mean_traditional_time+
  Mean_Passive_Use_Updated+
  Mean_Active_Use_Updated, data=Final_Frame.df))}

by(Final_Frame.df, list(Final_Frame.df$Gender), my.fun)

Output

Call:
lm(formula = Product_Use ~ Mean_social_combined + Mean_traditional_time + 
    Mean_Passive_Use_Updated + Mean_Active_Use_Updated, data = Final_Frame.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-26.592  -8.178  -3.936   6.228  62.258 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -0.5814     1.9664  -0.296 0.767612    
Mean_social_combined       2.4961     1.1797   2.116 0.034906 *  
Mean_traditional_time      1.0399     0.7416   1.402 0.161567    
Mean_Passive_Use_Updated   2.8230     0.8308   3.398 0.000739 ***
Mean_Active_Use_Updated    2.7562     1.7421   1.582 0.114329    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.07 on 451 degrees of freedom
  (18 observations deleted due to missingness)
Multiple R-squared:  0.1517,    Adjusted R-squared:  0.1442 
F-statistic: 20.17 on 4 and 451 DF,  p-value: 2.703e-15

--------------------------------------------------------------------------------------------- 
: 2

Call:
lm(formula = Product_Use ~ Mean_social_combined + Mean_traditional_time + 
    Mean_Passive_Use_Updated + Mean_Active_Use_Updated, data = Final_Frame.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-26.592  -8.178  -3.936   6.228  62.258 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -0.5814     1.9664  -0.296 0.767612    
Mean_social_combined       2.4961     1.1797   2.116 0.034906 *  
Mean_traditional_time      1.0399     0.7416   1.402 0.161567    
Mean_Passive_Use_Updated   2.8230     0.8308   3.398 0.000739 ***
Mean_Active_Use_Updated    2.7562     1.7421   1.582 0.114329    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.07 on 451 degrees of freedom
  (18 observations deleted due to missingness)
Multiple R-squared:  0.1517,    Adjusted R-squared:  0.1442 
F-statistic: 20.17 on 4 and 451 DF,  p-value: 2.703e-15

--------------------------------------------------------------------------------------------- 
: 3

Call:
lm(formula = Product_Use ~ Mean_social_combined + Mean_traditional_time + 
    Mean_Passive_Use_Updated + Mean_Active_Use_Updated, data = Final_Frame.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-26.592  -8.178  -3.936   6.228  62.258 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -0.5814     1.9664  -0.296 0.767612    
Mean_social_combined       2.4961     1.1797   2.116 0.034906 *  
Mean_traditional_time      1.0399     0.7416   1.402 0.161567    
Mean_Passive_Use_Updated   2.8230     0.8308   3.398 0.000739 ***
Mean_Active_Use_Updated    2.7562     1.7421   1.582 0.114329    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.07 on 451 degrees of freedom
  (18 observations deleted due to missingness)
Multiple R-squared:  0.1517,    Adjusted R-squared:  0.1442 
F-statistic: 20.17 on 4 and 451 DF,  p-value: 2.703e-15


Solution

1) broom. This will produce one data frame of the coefficients and another of the model statistics, using tidy and glance from the broom package:

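    library(broom)
    library(dplyr)

    # Coefficient table: one row per term within each cyl group
    mtcars %>%
      group_by(cyl) %>%
      group_modify(~ tidy(lm(mpg ~ disp + hp, data = .x))) %>%
      ungroup()

    # Model-level statistics (R-squared, sigma, F-statistic, p-value): one row per cyl group
    mtcars %>%
      group_by(cyl) %>%
      group_modify(~ glance(lm(mpg ~ disp + hp, data = .x))) %>%
      ungroup()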
    

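2) combined model. A single model with the intercepts and slopes nested within cyl is not equivalent to separate fits, because it pools the residual variance across groups, so the standard errors and the summary statistics differ. It does produce the same coefficient estimates.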

    summary(lm(mpg ~ factor(cyl)/(disp + hp) + 0, mtcars))
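
The claim that the coefficient estimates agree can be checked directly. The following is a minimal sketch in base R only; the names `sep`, `fit_comb`, `comb_mat`, `sigma_sep` and `sigma_comb` are illustrative and not part of the original answer. It also pulls out the residual standard errors, which is where the two approaches are expected to differ.

```r
# Separate fits: one lm per cyl group, coefficients collected column-wise
sep <- sapply(split(mtcars, mtcars$cyl),
              function(d) coef(lm(mpg ~ disp + hp, data = d)))

# Single nested model from approach 2
fit_comb <- lm(mpg ~ factor(cyl)/(disp + hp) + 0, data = mtcars)
comb <- coef(fit_comb)

# Rearrange the nested coefficients into the same term-by-group layout
groups <- colnames(sep)
comb_mat <- rbind(
  comb[paste0("factor(cyl)", groups)],
  comb[paste0("factor(cyl)", groups, ":disp")],
  comb[paste0("factor(cyl)", groups, ":hp")]
)
dimnames(comb_mat) <- dimnames(sep)

# TRUE when the estimates agree up to numerical tolerance
same_coefs <- isTRUE(all.equal(sep, comb_mat, tolerance = 1e-6))

# Residual standard errors: one per group versus a single pooled value
sigma_sep <- sapply(split(mtcars, mtcars$cyl),
                    function(d) summary(lm(mpg ~ disp + hp, data = d))$sigma)
sigma_comb <- summary(fit_comb)$sigma
```

Printing `same_coefs`, `sigma_sep` and `sigma_comb` shows the comparison; because the combined model uses one pooled residual standard error, its standard errors, t values and p-values should not be read as substitutes for the per-group ones.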
    

3) nlme. lmList from nlme fits a separate lm for each level of cyl and gives some of the same information. nlme comes with R, so it does not have to be installed, only loaded using library as below.

    library(nlme)
    summary(lmList(mpg ~ disp + hp | cyl, mtcars, pool = FALSE))
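
Returning to the by() call in the question: my.fun declares its argument as Final_Frame2.df, but the lm call inside uses data=Final_Frame.df, so every group is fitted on the full data frame. That is why the three printed summaries are identical. Below is a minimal base R sketch, on mtcars rather than the asker's data, of a function that fits on the subset it receives and then stacks the coefficient rows and the bottom-of-summary statistics into one table. It uses split and lapply in place of by to get a plain named list; `fit_one`, `fits` and `one_table` are hypothetical names chosen for illustration.

```r
# Fit on the subset that is passed in, not on the full data frame
fit_one <- function(d) lm(mpg ~ disp + hp, data = d)

# One fitted model per group, as a named list ("4", "6", "8")
fits <- lapply(split(mtcars, mtcars$cyl), fit_one)

# One row per group and term; model-level statistics are repeated on each row
one_table <- do.call(rbind, lapply(names(fits), function(g) {
  s <- summary(fits[[g]])
  f <- s$fstatistic  # named vector: value, numdf, dendf
  data.frame(
    group = g,
    term = rownames(s$coefficients),
    s$coefficients,  # Estimate, Std. Error, t value, Pr(>|t|)
    sigma = s$sigma,
    r.squared = s$r.squared,
    adj.r.squared = s$adj.r.squared,
    f.statistic = f[["value"]],
    p.value = pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE),
    row.names = NULL,
    check.names = FALSE
  )
}))

one_table
```

For the question's data, the same pattern would presumably apply with the formula swapped in and the split done on Gender. Repeating the model-level statistics on every coefficient row is a design choice that keeps everything in one rectangular table; two separate tables, as in approach 1, avoid that repetition.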