I´ve ben analyzing data for a paper and have now obtained results in mutiple linear regression. However the summaries R provides are not really fir for publication in the final paper. Also I have specified one variable in several different ways, to showcase the robustness of the results.
Hw can I create a nice, exportable table in R, that contains Variable Names (ideally also allows to name the variables in a more informative way), estimates, standard errors, robust standard errors p values and ideally also the significance indicators? For illustration:
I have summary outputs like this:
Residuals:
Min 1Q Median 3Q Max
-50.868 -4.644 1.583 7.054 20.490
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.710e+01 1.848e+01 2.549 0.0136 *
Var1 -8.588e-01 2.201e+00 -0.390 0.6979
Var2 2.486e+00 1.055e+00 2.357 0.0220 *
log(specification1) 3.376e+00 2.152e+00 1.569 0.1223
Var4 -3.651e-04 2.797e-04 -1.305 0.1971
Var5 4.809e+00 2.654e+00 1.812 0.0753 .
Var6 -8.706e+00 6.972e+00 -1.249 0.2170
Var7 -8.172e+00 5.755e+00 -1.420 0.1612
Var8 -3.276e+00 7.067e+00 -0.463 0.6448
Var9 -1.477e+01 7.849e+00 -1.882 0.0650 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
and
Residuals:
Min 1Q Median 3Q Max
-48.881 -5.699 0.956 8.947 17.888
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.258e+01 1.750e+01 2.405 0.0195 *
Var1 4.298e-01 2.120e+00 0.200 0.8421
Var2 5.179e+00 1.027e+00 2.122 0.0271 *
log(specification 2) 2.050e+00 9.435e-01 2.173 0.0338 *
Var4 -1.420e-04 2.261e-04 -1.513 0.1356
Var5 4.584e+00 2.511e+00 1.826 0.0730 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
and I would like to get to a table looking something like this:
Model1 Model2
Intercept Estimate Std.Error p robust_Std.Error robust_p Estimate Std.Error p robust ...
Var1
Var2
Var3
Var4
Var5
Var6
Var7
Var8
Var9
which in the columns of course contains the values of the estimates. Is there a function/ package that does that nicely?
Thanks in advance
I suggest you to use the broom
package, like this:
fit1 <- lm(mpg ~ ., mtcars)
broom::tidy(fit1)
# # A tibble: 11 x 5
# term estimate std.error statistic p.value
# <chr> <dbl> <dbl> <dbl> <dbl>
# 1 (Intercept) 12.3 18.7 0.657 0.518
# 2 cyl -0.111 1.05 -0.107 0.916
# 3 disp 0.0133 0.0179 0.747 0.463
# 4 hp -0.0215 0.0218 -0.987 0.335
# 5 drat 0.787 1.64 0.481 0.635
# 6 wt -3.72 1.89 -1.96 0.0633
# 7 qsec 0.821 0.731 1.12 0.274
# 8 vs 0.318 2.10 0.151 0.881
# 9 am 2.52 2.06 1.23 0.234
# 10 gear 0.655 1.49 0.439 0.665
# 11 carb -0.199 0.829 -0.241 0.812
It will extract a tibble
from the output of the lm
function.
If you have more than one model and you wanna set all the tibbles together with common terms you can deal with it this way:
Create a list x
of your models.
fit1 <- lm(mpg ~ cyl + disp + gear, mtcars)
fit2 <- lm(mpg ~ cyl + hp + drat, mtcars)
x <- list(fit1, fit2)
You can use this solution:
library(purrr)
library(dplyr)
library(stringr)
# set names for the list
names(x) <- paste("Model", seq_along(x), sep = "_")
# tidy them up
x <- map(x, broom::tidy)
# set the list names at the beginning of each column
x <- imap(x, ~set_names(.x, paste(.y, names(.x), sep = "_")))
# rename each term column as "term"
x <- map(x, ~rename_with(.x, str_replace, pattern = ".*term", replacement = "term"))
# join them all together
reduce(x, full_join, by = "term")
It returns the output you asked for:
# A tibble: 6 x 9
term Model1_estimate Model1_std.error Model1_statistic Model1_p.value Model2_estimate Model2_std.error Model2_statistic Model2_p.value
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 34.0 4.76 7.13 0.0000000925 22.5 7.99 2.82 0.00880
2 cyl -1.59 0.724 -2.20 0.0366 -1.36 0.735 -1.85 0.0747
3 disp -0.0200 0.0109 -1.83 0.0774 NA NA NA NA
4 gear 0.158 0.910 0.174 0.863 NA NA NA NA
5 hp NA NA NA NA -0.0288 0.0153 -1.88 0.0704
6 drat NA NA NA NA 2.84 1.52 1.87 0.0725
If your list has more than two models, the code will be stable.