
gtsummary help repeated variable



df %>% 
    tbl_summary(by=Field)

It seems that when a variable has repeated values, this package gives a strange result. It gave me the n (%) for each value of the variable instead of the median (IQR) and p-values. Is there a way to get around this?

I tried everything possible.


Solution

  • The issue is that, by default, tbl_summary treats a numeric variable with fewer than 10 unique values as categorical, so it reports n (%) for each value rather than the median (IQR). From the docs (?tbl_summary):

    ... numeric variables with fewer than 10 unique levels default to type categorical.

    To fix this, explicitly set the type to continuous:

    To change a numeric variable to continuous that defaulted to categorical, use type = list(varname ~ "continuous")

    Using some fake example data:

    library(gtsummary)
    
    set.seed(123)
    
    df <- data.frame(
      Alphonso = sample(1:6, 100, replace = TRUE),
      Field = sample(c("field1", "field2"), 100, replace = TRUE)
    )
    
    df %>% 
      tbl_summary(by=Field)
    


    df %>% 
      tbl_summary(by=Field, type = list(Alphonso ~ "continuous"))
    

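
Since the question also asks about p-values: tbl_summary() does not add them by itself. They come from a separate add_p() call. Below is a minimal sketch on the same fake data. It assumes the default test that add_p() picks for a continuous variable across two groups (a Wilcoxon rank-sum test, as far as I know) is acceptable, and that any extra packages your gtsummary version needs for add_p() are installed. The statistic argument just spells out what I understand to be the default median (Q1, Q3) format, so you can see where to change it.

```r
library(gtsummary)

set.seed(123)

# Same fake data as above: a numeric column with only 6 distinct values
df <- data.frame(
  Alphonso = sample(1:6, 100, replace = TRUE),
  Field = sample(c("field1", "field2"), 100, replace = TRUE)
)

tbl <- df %>%
  tbl_summary(
    by = Field,
    # Override the categorical default for the low-cardinality numeric column
    type = list(Alphonso ~ "continuous"),
    # Explicit summary format: median (25th percentile, 75th percentile)
    statistic = list(all_continuous() ~ "{median} ({p25}, {p75})")
  ) %>%
  # Add a p-value column comparing the two Field groups
  add_p()

tbl
```

If you want a different summary, e.g. mean (SD), swap the statistic string for "{mean} ({sd})".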