Order regression variables in table using gtsummary


Is it possible to order variables in a regression table using a gtsummary function? For example, let's say I have the following model:

model <- lm(formula = mpg ~ disp + drat + hp, data = mtcars) 

I would like to create a regression table with the independent variables in the following order: drat, then hp, and then disp. The following code would accomplish that goal:

library(gtsummary)
lm(formula = mpg ~ drat + hp + disp, data = mtcars) |>
  tbl_regression()

But, rather than re-running the regression, I would like to reorder the variables contained in the model object. Is there a way to do this using a gtsummary function (or some other post-estimation function)?


Solution

  • The easiest way to re-order the variables, as you mentioned, is to re-run the model with the variables in the order you like. But this can be time-consuming with large models.

    Every gtsummary object contains a data frame called .$table_body. The printed table is essentially a formatted rendering of this data frame, so you can re-order its rows as needed.

    There is a column in the data frame called variable, and you can sort whichever variables you'd like to the top or bottom. Example below!

    library(gtsummary)
    #> #BlackLivesMatter
    library(dplyr)
    
    model <- lm(formula = mpg ~ disp + drat + hp, data = mtcars) 
    
    tbl_regression(model) %>%
      # re-order the variables in the table
      modify_table_body(
        ~.x %>%
          # move hp to the top first...
          arrange(desc(variable == "hp")) %>%
          # ...then move drat above it, giving: drat, hp, disp
          arrange(desc(variable == "drat"))
      ) %>%
      as_kable() # convert to kable so it'll display on stackoverflow
    
    |Characteristic | Beta  |   95% CI    | p-value |
    |:--------------|:-----:|:-----------:|:-------:|
    |drat           |  2.7  | -0.33, 5.8  |  0.079  |
    |hp             | -0.03 | -0.06, 0.00 |  0.027  |
    |disp           | -0.02 | -0.04, 0.00 |  0.050  |

    Created on 2021-10-30 by the reprex package (v2.0.1)
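
If you need to set the full order at once rather than moving variables to the top one at a time, a possible variation is to sort on each row's position in a vector of variable names. This is only a sketch of the same `modify_table_body()` idea: `var_order` is an illustrative name, not something gtsummary provides, and any variable missing from the vector would sort to the bottom.

```r
library(gtsummary)
library(dplyr)

model <- lm(formula = mpg ~ disp + drat + hp, data = mtcars)

# desired order of the variables in the table (illustrative helper vector)
var_order <- c("drat", "hp", "disp")

tbl <- tbl_regression(model) %>%
  modify_table_body(
    # match() returns each row's position in var_order;
    # rows that share a variable get the same position and stay grouped
    ~ .x %>% arrange(match(variable, var_order))
  )

# inspect the underlying data frame to confirm the new row order
tbl$table_body %>% select(variable, label, estimate)
```

Because the model object itself is untouched, only the displayed rows move; the estimates are the same as in the original fit.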