I am running a regression in R with many time and location fixed effects, and I am trying to output a clean summary table to LaTeX. I switched from the stargazer package to huxtable because stargazer does not behave consistently when omitting fixed effects (see this question).
Here is a simple example:
library(huxtable)
reg1 <- lm(mpg ~ disp, data = mtcars)
reg2 <- lm(mpg ~ disp + factor(gear) + factor(carb), data = mtcars)
huxreg(reg1, reg2)
The output of huxreg is:
> huxreg(reg1, reg2)
────────────────────────────────────────────────────
                      (1)            (2)
                 ───────────────────────────────────
  (Intercept)        29.600 ***     25.533 ***
                     (1.230)        (2.996)
  disp               -0.041 ***     -0.018
                     (0.005)        (0.011)
  factor(gear)4                      3.988
                                    (2.495)
  factor(gear)5                      5.391 *
                                    (2.591)
  factor(carb)2                     -1.979
                                    (1.667)
  factor(carb)3                     -4.161
                                    (2.131)
  factor(carb)4                     -6.199 *
                                    (2.221)
  factor(carb)6                     -8.557 *
                                    (3.653)
  factor(carb)8                    -10.389 *
                                    (4.268)
                 ───────────────────────────────────
  N                  32             32
  R2                  0.718          0.828
  logLik            -82.105        -74.186
  AIC               170.209        168.372
────────────────────────────────────────────────────
*** p < 0.001; ** p < 0.01; * p < 0.05.
Column names: names, model1, model2
Here is the desired output:
────────────────────────────────────────────────────
                      (1)            (2)
                 ───────────────────────────────────
  (Intercept)        29.600 ***     25.533 ***
                     (1.230)        (2.996)
  disp               -0.041 ***     -0.018
                     (0.005)        (0.011)
                 ───────────────────────────────────
  Gear FE            No             Yes
  Carb FE            No             Yes
                 ───────────────────────────────────
  N                  32             32
  R2                  0.718          0.828
  logLik            -82.105        -74.186
  AIC               170.209        168.372
────────────────────────────────────────────────────
*** p < 0.001; ** p < 0.01; * p < 0.05.
Column names: names, model1, model2
I know I could simply edit the huxtable using add_rows(), but I am looking for a more robust solution that would let me find row names using regular expressions (like stargazer's omit.labels option).
I wrote the answer myself, using this as inspiration.
The function check_factors() determines whether the given variables are present in a model, and sapply() is then used to build the rows that are added to the table. This is not fully automatic, though: I still have to check that every variable listed in omit_coefs is also tested by check_factors(). It is possible to omit a variable and then forget to add the corresponding row.
library(huxtable)
library(broom)    # tidy()
library(dplyr)    # filter(), pull(), %>%
library(stringr)  # str_detect()

reg1 <- lm(mpg ~ disp, data = mtcars)
reg2 <- lm(mpg ~ disp + factor(gear) + factor(carb), data = mtcars)
huxreg(reg1, reg2)

# Collect the coefficient names that belong to each set of fixed effects
gear_factors <- tidy(reg2) %>%
  filter(str_detect(term, "factor\\(gear\\)")) %>%  # in R, you have to escape the escape, hence \\
  pull(term)
carb_factors <- tidy(reg2) %>%
  filter(str_detect(term, "factor\\(carb\\)")) %>%
  pull(term)

# TRUE if every term in `factors` appears among the model's coefficients
check_factors <- function(model, factors) {
  all(factors %in% (tidy(model) %>% pull(term)))
}

models_report <- list(reg1, reg2)
huxreg(models_report,
       omit_coefs = c(gear_factors, carb_factors)) %>%
  # add the rows, with the TRUE/FALSE returned by check_factors() replaced by "Yes"/"No"
  add_rows(rbind(c("Gear FE",
                   ifelse(sapply(models_report, check_factors, factors = gear_factors),
                          "Yes", "No")),
                 c("Carb FE",
                   ifelse(sapply(models_report, check_factors, factors = carb_factors),
                          "Yes", "No"))),
           copy_cell_props = FALSE,  # this will prevent horizontal lines from appearing
           after = nrow(.) - 5)
This produces the following table:
────────────────────────────────────────────────────
                      (1)            (2)
                 ───────────────────────────────────
  (Intercept)        29.600 ***     25.533 ***
                     (1.230)        (2.996)
  disp               -0.041 ***     -0.018
                     (0.005)        (0.011)
                 ───────────────────────────────────
  Gear FE            No             Yes
  Carb FE            No             Yes
  N                  32             32
  R2                  0.718          0.828
  logLik            -82.105        -74.186
  AIC               170.209        168.372
────────────────────────────────────────────────────
*** p < 0.001; ** p < 0.01; * p < 0.05.
Column names: names, model1, model2
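One way to narrow the gap mentioned above (omitting a variable and forgetting its row) might be to derive both the omitted coefficients and the indicator rows from a single named vector of regular expressions. The sketch below is only an illustration: fe_patterns and fe_summary() are hypothetical names, not part of huxtable. It also assumes that names(coef(model)) matches the terms huxreg displays, which holds for lm but is not verified for other model classes, and it reports "Yes" when any coefficient matches the pattern, which is looser than the all() test in check_factors().

```r
library(huxtable)

reg1 <- lm(mpg ~ disp, data = mtcars)
reg2 <- lm(mpg ~ disp + factor(gear) + factor(carb), data = mtcars)
models_report <- list(reg1, reg2)

# Hypothetical: row label -> regex matching that fixed effect's coefficients
fe_patterns <- c("Gear FE" = "^factor\\(gear\\)",
                 "Carb FE" = "^factor\\(carb\\)")

# Hypothetical helper: returns the coefficients to omit and the Yes/No rows,
# both computed from the same patterns so they cannot drift apart.
fe_summary <- function(models, patterns) {
  # Assumption: coefficient names equal the terms shown by huxreg (true for lm)
  terms_by_model <- lapply(models, function(m) names(coef(m)))
  all_terms <- unique(unlist(terms_by_model))

  # Every term matched by any pattern is omitted from the table
  matched <- lapply(patterns, function(p) grep(p, all_terms, value = TRUE))
  omit <- unique(unlist(matched, use.names = FALSE))

  # One row per pattern: the label, then "Yes"/"No" for each model
  rows <- t(sapply(names(patterns), function(label) {
    has_fe <- vapply(terms_by_model,
                     function(tt) any(grepl(patterns[[label]], tt)),
                     logical(1))
    c(label, ifelse(has_fe, "Yes", "No"))
  }))
  list(omit = omit, rows = unname(rows))
}

fe <- fe_summary(models_report, fe_patterns)

tab <- huxreg(models_report, omit_coefs = fe$omit)
# Same offset as in the answer's code: it relies on the default layout of
# four summary-statistic rows plus the note row below the coefficients.
tab <- add_rows(tab, fe$rows, copy_cell_props = FALSE, after = nrow(tab) - 5)
tab
```

With this layout, adding another set of fixed effects would only require one more entry in fe_patterns. The nrow(tab) - 5 offset still has to be adjusted by hand if the statistics shown by huxreg change.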