I am running multivariable regression on a list of outcome variables with a consistent set of independent variables. For univariable regression, I followed this example to use tbl_uvregression from gtsummary on a nested data frame. I am now trying to generalize this to multivariable regression using tbl_regression on a nested data frame, but when I try to unnest the tables, I get the error that "the input must be a list of vectors." Below is what I have tried. I assume there's some small but critical step that I'm missing, but I can't figure out what it is. My desired output is a table of multivariable regression output, with each model as a column and all the covariates as rows (similar to performing tbl_merge on a list of each of these multivariable models run separately in tbl_regression).
library(tidyverse)
library(magrittr)
library(gtsummary)
library(broom)
id <- 1:2000
gender <- sample(0:1, 2000, replace = T)
age <- sample(17:64, 2000, replace = T)
race <- sample(0:1, 2000, replace = T)
health_score <- sample(0:25, 2000, replace = T)
cond_a <- sample(0:1, 2000, replace = T)
cond_b <- sample(0:1, 2000, replace = T)
cond_c <- sample(0:1, 2000, replace = T)
cond_d <- sample(0:1, 2000, replace = T)
df <- data.frame(id, gender, age, race, health_score, cond_a, cond_b, cond_c, cond_d)
regression_tables <- df %>%
  select(-id) %>%
  gather(c(cond_a, cond_b, cond_c, cond_d), key = "condition", value = "case") %>%
  group_by(condition) %>%
  nest() %>%
  mutate(model = map(data, ~glm(case ~ gender + age + race + health_score, family = "binomial", data = .)),
         table = map(model, ~tbl_regression, exponentiate = T, conf.level = 0.99)) %>%
  select(table) %>%
  unnest(table)
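The issue is the lambda expression (~): ~tbl_regression only refers to the function and never calls it on the model, so each element of table ends up being the function itself, and the extra arguments (exponentiate, conf.level) are silently ignored. Either pass tbl_regression to map directly along with its arguments, or call it inside the lambda, i.e. ~tbl_regression(.x, exponentiate = TRUE, conf.level = 0.99). Also, there are no tidy methods available (from broom) to extract a tibble from a tbl_regression object, so the unnest step is dropped here and the tables are kept in a list column.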
library(dplyr)
library(tidyr)
library(purrr)   # provides map()
library(broom)
library(gtsummary)
out <- df %>%
  select(-id) %>%
  gather(c(cond_a, cond_b, cond_c, cond_d), key = "condition",
         value = "case") %>%
  group_by(condition) %>%
  nest() %>%
  mutate(model = map(data,
                     ~glm(case ~ gender + age + race + health_score,
                          family = "binomial", data = .)),
         # pass the function itself; extra arguments are forwarded to it
         table = map(model, tbl_regression, exponentiate = TRUE, conf.level = 0.99)) %>%
  select(table)
# print the table for the first condition
out$table[[1]]
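To see why the lambda form fails, here is a minimal sketch that uses round as a stand-in for tbl_regression (the vector x and the result names are made up for illustration). It only assumes purrr is installed.

```r
library(purrr)

x <- c(4.26, 9.71)  # hypothetical inputs standing in for the fitted models

# `~round` never calls round(); every element is the function object itself,
# and `digits = 1` is swallowed by the lambda's `...` and ignored
wrong <- map(x, ~round, digits = 1)

# Passing the function directly forwards `digits = 1` to round()
right_direct <- map(x, round, digits = 1)

# Equivalent: call the function explicitly inside the lambda
right_lambda <- map(x, ~round(.x, digits = 1))

is.function(wrong[[1]])               # TRUE: a list of functions, not results
identical(right_direct, right_lambda) # both forms give the same values
```

A list of functions is not a list of vectors, which matches the error message that unnest reported.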
In addition to the OP's method of looping with map, we could simply fit the model and apply tbl_regression after nest_by. Here gather is replaced with pivot_longer, because gather is superseded and pivot_longer is its more general replacement.
out <- df %>%
  select(-id) %>%
  pivot_longer(cols = starts_with('cond'),
               names_to = 'condition', values_to = 'case') %>%
  nest_by(condition) %>%
  mutate(model = list(glm(case ~ gender + age +
                            race + health_score,
                          family = "binomial", data = data)),
         table_out = list(tbl_regression(model, exponentiate = TRUE, conf.level = 0.99)))
out
# A tibble: 4 x 4
# Rowwise: condition
# condition data model table_out
# <chr> <list<tbl_df[,5]>> <list> <list>
#1 cond_a [2,000 × 5] <glm> <tbl_rgrs>
#2 cond_b [2,000 × 5] <glm> <tbl_rgrs>
#3 cond_c [2,000 × 5] <glm> <tbl_rgrs>
#4 cond_d [2,000 × 5] <glm> <tbl_rgrs>
If we need a merged table, apply tbl_merge on the list of tbl_regression objects
tbl_merge(out$table_out)
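By default the merged columns are labelled generically, so it may help to label each model with its outcome through the tab_spanner argument of tbl_merge. The sketch below is self-contained and uses a small made-up data set (toy, fit_a, fit_b are hypothetical names) with only two outcomes and two covariates. With the nested result above, the analogous call would presumably be tbl_merge(out$table_out, tab_spanner = out$condition).

```r
library(gtsummary)

set.seed(1)  # toy data, for illustration only
n <- 200
toy <- data.frame(
  age    = sample(17:64, n, replace = TRUE),
  gender = sample(0:1, n, replace = TRUE),
  cond_a = sample(0:1, n, replace = TRUE),
  cond_b = sample(0:1, n, replace = TRUE)
)

# one multivariable logistic model per outcome, same covariates
fit_a <- glm(cond_a ~ age + gender, family = binomial, data = toy)
fit_b <- glm(cond_b ~ age + gender, family = binomial, data = toy)

tables <- list(
  tbl_regression(fit_a, exponentiate = TRUE, conf.level = 0.99),
  tbl_regression(fit_b, exponentiate = TRUE, conf.level = 0.99)
)

# covariates become rows; each model gets its own labelled column group
merged <- tbl_merge(tables, tab_spanner = c("cond_a", "cond_b"))
merged
```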