Search code examples
dplyrstatisticssummarygtsummaryancova

How to specify covariates in gtsummary ANCOVA


In a gtsummary table, I want to show the p-value for the difference in several variables between group A and group B, corrected for age and sex. I found here that I can use the ANCOVA test in gtsummary::add_p. However, I can't figure out how to specify my covariates "age" and "sex" in test_args.

In the following example, I want to calculate the p-value of the ancova for var1 and var2 (corrected for age and sex). Var3 is a variable with levels that I want to perform the chisq.test on.

library(dplyr)
library(gtsummary)

df <- data.frame(
  group = c(2, 1, 1, 2, 1, 2), 
  var1 = (rnorm(6,mean=10, sd=3)), 
  var2 = (rnorm(6,mean=6, sd=1)), 
  var3 = c(0, 4, 1, 3, 1, 1), 
  age = c(50, 32, 26, 46, 38, 62), 
  sex = c(1, 0, 1, 1, 1, 0))

df %>% 
tbl_summary(
by=group, 
type=list(c("var1":"age") ~ 'continuous', "sex" ~ 'categorical'),
statistic = list(c("var1", "var2", "age") ~ "{mean} ± {sd}", "var3" ~ "{median} ({p25}, {p75})")
) %>%
add_p(test=list(c("var1", "var2") ~ "ancova", "var3" ~ "chisq.test"), test_args = list(all_tests("ancova") ~ list(by = group, adj.vars = c("age", "sex"))))

I get the following error

'...' must be empty
Problematic argument: test_args = (...)

I found a similar question here, but it was not answered unfortunately. Any help is greatly appreciated!


Solution

  • To get adjusted results, you will need to use add_p() with a custom function for the p-value calculation. See below!

    library(gtsummary)
    packageVersion("gtsummary")
    #> [1] '1.7.2'
    set.seed(1234)
    
    df <- data.frame(
      group = c(2, 1, 1, 2, 1, 2), 
      var1 = (rnorm(6,mean=10, sd=3)), 
      var2 = (rnorm(6,mean=6, sd=1)), 
      var3 = c(0, 4, 1, 3, 1, 1), 
      age = c(50, 32, 26, 46, 38, 62), 
      sex = c(1, 0, 1, 1, 1, 0))
    
    my_ancova <- function(data, variable, by, ...) {
      aov(
        formula = reformulate(c(by, c("age", "sex")), response = variable),
        data = data
      ) |> 
        broom::tidy() |> 
        dplyr::slice_head(n = 1L) |> 
        dplyr::mutate(
          method = glue::glue("ANCOVA adjusted for age and sex")
        )
    }
    my_ancova(df, variable = "var1", by = "group")
    #> # A tibble: 1 × 7
    #>   term     df sumsq meansq statistic p.value method                         
    #>   <chr> <dbl> <dbl>  <dbl>     <dbl>   <dbl> <glue>                         
    #> 1 group     1  35.1   35.1      3.07   0.222 ANCOVA adjusted for age and sex
    aov(var1 ~ group + age + sex, data = df) |> broom::tidy()
    #> # A tibble: 4 × 6
    #>   term         df sumsq meansq statistic p.value
    #>   <chr>     <dbl> <dbl>  <dbl>     <dbl>   <dbl>
    #> 1 group         1 35.1   35.1     3.07     0.222
    #> 2 age           1 16.5   16.5     1.44     0.352
    #> 3 sex           1  1.08   1.08    0.0942   0.788
    #> 4 Residuals     2 22.8   11.4    NA       NA
    
    df |> 
      tbl_summary(
        include = c("var1", "var2"),
        by = group, 
        type= c("var1", "var2") ~ 'continuous',
        statistic = c("var1", "var2") ~ "{mean} ± {sd}"
      ) |> 
      add_p(test = everything() ~ my_ancova, pvalue_fun = ~style_pvalue(.x, digits = 3)) |> 
      as_kable()
    
    Characteristic 1, N = 3 2, N = 3 p-value
    var1 11.79 ± 1.29 6.95 ± 4.31 0.222
    var2 5.47 ± 0.05 5.18 ± 0.22 0.202

    Created on 2024-03-13 with reprex v2.1.0