
Automate univariate and multivariable logistic models, return formatted results in R


I need to run several univariate and multivariable logistic regression models on the same dataset, so I want to loop over the outcomes rather than duplicate the same code.

I would like to label each output table clearly with a title, so I can tell the different models apart within an RMarkdown PDF document, e.g. "Univariate regression: Outcome = out1", where the variable part is the outcome name (out1 to out3). Similarly, the multivariable models should be titled something like "Multivariable regression: Outcome = out1" for each of the outcomes.

I am using the gtsummary package so I can get nicely formatted results together with the accompanying footnotes.

I have tried the following, but without success. I would appreciate any assistance.

# Libraries
library(gtsummary)
library(tidyverse)

# Data as well as a few artificial variables
data("iris")
my_iris <- as.data.frame(iris)

my_iris$out1 <- sample(c(0,1), 150, replace = TRUE)
my_iris$out2 <- sample(c(0,1), 150, replace = TRUE)
my_iris$out3 <- sample(c(0,1), 150, replace = TRUE)

my_iris$x1 <- sample(c(1:12), 150, replace = TRUE)
my_iris$x2 <- sample(c(50:100), 150, replace = TRUE)
my_iris$x3 <- sample(c(18:100), 150, replace = TRUE)


# This is the list of outcome variables I need to run univariate and multivariable logistic regressions for.
outcome <- c("out1", "out2", "out3")

# Univariate logistic models
for (out in seq_along(outcome)) {
my_iris %>% 
  dplyr::select(outcome[out], Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species) %>% 
  tbl_uvregression(method = glm,
                   y = outcome[out],
                   method.args = list(family = binomial),
                   exponentiate = TRUE) %>%
  bold_labels() 
}


# Multivariable logistic models
for (out in seq_along(outcome)) {
  tbl_regression(glm(outcome[out] ~ Species + Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, my_iris, family = binomial), exponentiate = TRUE)

}


Solution

  • Wrap each model in a function that takes the outcome name, then loop over the outcome names with one of the apply functions.

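    # Fit a logistic model of one outcome on Sepal.Length and print a captioned table.
    # Note: this covers a single predictor only; the other predictors are not looped here.
    uni_tbl_model <- function(my_iris, outcome) {
      model <- glm(my_iris[, outcome] ~ Sepal.Length, my_iris, family = binomial)
      tbl <- tbl_regression(model, exponentiate = TRUE)
      tbl <- tbl %>% modify_caption(paste("Univariate Regression Model with", outcome, "as Outcome", sep = " "))
      print(tbl)
    }
    
    # Fit the multivariable logistic model for one outcome and print a captioned table.
    multi_tbl_model <- function(my_iris, outcome) {
      model <- glm(my_iris[, outcome] ~ Species + Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, my_iris, family = binomial)
      tbl <- tbl_regression(model, exponentiate = TRUE)
      tbl <- tbl %>% modify_caption(paste("Multivariable Regression Model with", outcome, "as Outcome", sep = " "))
      print(tbl)
    }
    
    # `outcome` is the character vector of outcome names defined in the question.
    # invisible() discards the return values; the tables are printed inside the functions.
    invisible(lapply(outcome, function(out) uni_tbl_model(my_iris, out)))
    invisible(lapply(outcome, function(out) multi_tbl_model(my_iris, out)))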
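
One side effect of writing the response as my_iris[, outcome] is that the fitted model's formula carries that expression rather than the outcome's name. A possible alternative, sketched below with base R only, is to build the formula from the outcome name with reformulate(). The helper fit_logistic(), the predictor_names vector and the seed are assumptions made for illustration and are not part of the original answer.

```r
# Sketch under stated assumptions: fit_logistic() and predictor_names are
# hypothetical names, and the seed is only there to make the simulated
# outcomes reproducible.
set.seed(1)

# Same artificial outcomes as in the question
my_iris <- as.data.frame(iris)
my_iris$out1 <- sample(c(0, 1), 150, replace = TRUE)
my_iris$out2 <- sample(c(0, 1), 150, replace = TRUE)
my_iris$out3 <- sample(c(0, 1), 150, replace = TRUE)

outcome <- c("out1", "out2", "out3")
predictor_names <- c("Species", "Sepal.Length", "Sepal.Width",
                     "Petal.Length", "Petal.Width")

fit_logistic <- function(data, out, predictor_names) {
  # reformulate() builds a formula such as out1 ~ Species + Sepal.Length + ...
  model_formula <- reformulate(predictor_names, response = out)
  glm(model_formula, data = data, family = binomial)
}

# One fitted model per outcome, stored in a list named by outcome
models <- lapply(outcome, fit_logistic,
                 data = my_iris, predictor_names = predictor_names)
names(models) <- outcome

# The response now appears under its own name in the formula
formula(models$out1)
```

Each element of models could then be passed to tbl_regression(..., exponentiate = TRUE) and captioned with modify_caption(), as in the functions above.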