
R gtsummary package: How to Manipulate / Hide Rows in Summary Table


I am working on a project with gtsummary. For one of the tables, I have to build a long table listing covariables before and after the matchit process.

My issue is that each covariable (Obesity, for example) takes up three rows: one for the variable name, Obesity, then one for Obese, and then one for Not Obese. That is three rows where I wish to show only one, for example: Diabetes N (%).

I have tried editing the dichotomous variables, introducing NULL, and looking for a row_hide function, but to no avail.

Here is my code:

Creation of trial

trialCAS1 <- index_CAS %>%
  select(TopDecile, Gender, Obesity, Diabetes, Diabetes_Complex, etc)

Tbl summary

CAStable1 <- tbl_summary(trialCAS1,
                         by = TopDecile,
                         missing = "no") %>%
  add_n() %>%
  modify_header(label = "**Variable**") %>%
  bold_labels()

I included the first table I get.



Solution

  • The tbl_summary() function tries its best to guess the type of data passed (categorical, dichotomous, and continuous). It doesn't always guess what we'd like to see, but the default can always be changed using arguments in tbl_summary()! I'll use the trial data set in the {gtsummary} package as an example.

    Here is the default output:

    library(gtsummary)
    trial %>%
      select(trt, grade, stage) %>%
      tbl_summary(by = trt)
    


    By default, the summary statistics for grade and stage are shown on multiple rows. Imagine, however, that we are only interested in the rate of Grade I disease and the rate of Stage T1 cancer. We can use the tbl_summary(value=) argument to specify that these are the only values we want displayed (the variables are then printed as dichotomous, on a single row each). In the example below, I have also updated the labels to indicate that these are Grade I and Stage T1 rates only.

    trial %>%
      select(trt, grade, stage) %>% 
      tbl_summary(
        by = trt,
        value = list(grade ~ "I",
                     stage ~ "T1"),
        label = list(grade ~ "Grade I",
                     stage ~ "Stage T1")
      ) 
    


    Based on what I see from your code and output, I think this code will work for you on your data set:

    tbl_summary(
      trialCAS1, 
      by = TopDecile,
      missing = "no",
      value = Obesity ~ "Obese",
      label = Obesity ~ "Obese"
    )
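
If several covariates need the same treatment, value= and label= each take a list with one formula per variable, as in the grade/stage example above. Here is a minimal sketch on a made-up data frame that also keeps the add_n(), modify_header(), and bold_labels() calls from the question. The data frame demo_cas, its values, and the level names "Obese" and "Diabetic" are assumptions for illustration only; substitute the levels that actually appear in your own columns.

```r
library(gtsummary)

# Hypothetical stand-in for trialCAS1: the column names come from the question,
# but the values and the level names ("Obese", "Diabetic") are assumptions.
demo_cas <- data.frame(
  TopDecile = rep(c("No", "Yes"), each = 20),
  Obesity   = rep(c("Obese", "Not Obese"), times = 20),
  Diabetes  = rep(c("Diabetic", "Not Diabetic", "Not Diabetic", "Diabetic"),
                  times = 10)
)

demo_tbl <- tbl_summary(
  demo_cas,
  by = TopDecile,
  missing = "no",
  # One formula per variable: show only the named level of each covariate
  value = list(Obesity ~ "Obese", Diabetes ~ "Diabetic"),
  # Relabel each row so it is clear which level is being reported
  label = list(Obesity ~ "Obese", Diabetes ~ "Diabetic")
) |>
  add_n() |>
  modify_header(label = "**Variable**") |>
  bold_labels()

demo_tbl
```

Each listed variable should then take up a single row showing n (%) for the chosen level, instead of a header row plus one row per level.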