
Limit tbl_uvregression() output based on p-value threshold


I am running univariate logistic regressions for feature selection and want to display only the significant variables in the output table. Is there an argument within tbl_uvregression() that would let me display only the variables that have a p-value < 0.05? Code below:

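library(gtsummary)

tbl <- df %>%
  tbl_uvregression(
    method = glm,
    y = diabetes,
    method.args = list(family = binomial),  # logistic regression; glm() defaults to gaussian
    exponentiate = TRUE
  ) %>%
  add_global_p() %>%
  bold_p()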

I tried to use the include argument to manually set the variables that were bolded (so as to signify variables with a p-value < 0.05). Surely there's a more efficient way.


Solution

  • You can modify the table in the returned tbl_uvregression object, either directly through tbl$table_body or by using gtsummary::modify_table_body().
    Or just use gtsummary::filter_p():

    library(gtsummary)
    
    tbl <- mtcars[6:11] %>% 
      tbl_uvregression(
        method = glm,
        y = vs,
        method.args = list(family = binomial),
        exponentiate = TRUE
      ) %>% 
      add_global_p() %>%
      bold_p() 
    
    tbl %>% 
      as_kable(format = "pipe")
    
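    | Characteristic | N  | OR   | 95% CI     | p-value |
    |:---------------|:---|:-----|:-----------|:--------|
    | wt             | 32 | 0.15 | 0.03, 0.49 | <0.001  |
    | qsec           | 32 | 22.7 | 4.05, 582  | <0.001  |
    | am             | 32 | 2.00 | 0.48, 8.76 | 0.3     |
    | gear           | 32 | 1.79 | 0.68, 5.14 | 0.2     |
    | carb           | 32 | 0.27 | 0.09, 0.58 | <0.001  |
    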
    tbl %>% 
      filter_p() %>% 
      as_kable(format = "pipe")
    
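    | Characteristic | N  | OR   | 95% CI     | p-value |
    |:---------------|:---|:-----|:-----------|:--------|
    | wt             | 32 | 0.15 | 0.03, 0.49 | <0.001  |
    | qsec           | 32 | 22.7 | 4.05, 582  | <0.001  |
    | carb           | 32 | 0.27 | 0.09, 0.58 | <0.001  |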

    Created on 2023-09-02 with reprex v2.0.2
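
For the first route mentioned in the answer (editing the table body yourself), here is a minimal sketch using the same mtcars example. It assumes that the table body exposes `variable` and `p.value` columns, which is how gtsummary regression tables are usually laid out; `threshold`, `sig_vars`, and `tbl_sig` are illustrative names, not part of the package. Filtering by variable name rather than row by row keeps the level rows of categorical predictors, whose own p-value cells may be empty after add_global_p().

```r
library(gtsummary)
library(dplyr)

tbl <- mtcars[6:11] %>%
  tbl_uvregression(
    method = glm,
    y = vs,
    method.args = list(family = binomial),
    exponentiate = TRUE
  ) %>%
  add_global_p()

# Hypothetical cutoff; change as needed
threshold <- 0.05

# Names of the variables whose p-value falls below the cutoff
sig_vars <- tbl$table_body %>%
  dplyr::filter(!is.na(p.value), p.value < threshold) %>%
  dplyr::pull(variable) %>%
  unique()

# Keep every row that belongs to one of those variables
tbl_sig <- tbl %>%
  modify_table_body(
    fun = function(body) dplyr::filter(body, variable %in% sig_vars)
  )
```

Printing `tbl_sig` (or piping it to as_kable()) should then show only the retained variables. If a different cutoff is all you need, the filter_p() documentation also describes a threshold argument `t` (default 0.05), so it is worth checking `?filter_p` in your installed version before writing the filter by hand.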