dplyr statistics summary gtsummary ancova

How to specify covariates in gtsummary ANCOVA

In a gtsummary table, I want to show the p-value for the difference in several variables between group A and group B, corrected for age and sex. I found here that I can use the ANCOVA test in gtsummary::add_p. However, I can't figure out how to specify my covariates "age" and "sex" in test_args.

In the following example, I want to calculate the p-value of the ancova for var1 and var2 (corrected for age and sex). Var3 is a variable with levels that I want to perform the chisq.test on.

library(dplyr)
library(gtsummary)

df <- data.frame(
  group = c(2, 1, 1, 2, 1, 2), 
  var1 = (rnorm(6,mean=10, sd=3)), 
  var2 = (rnorm(6,mean=6, sd=1)), 
  var3 = c(0, 4, 1, 3, 1, 1), 
  age = c(50, 32, 26, 46, 38, 62), 
  sex = c(1, 0, 1, 1, 1, 0))

df %>% 
tbl_summary(
by=group, 
type=list(c("var1":"age") ~ 'continuous', "sex" ~ 'categorical'),
statistic = list(c("var1", "var2", "age") ~ "{mean} ± {sd}", "var3" ~ "{median} ({p25}, {p75})")
) %>%
add_p(test=list(c("var1", "var2") ~ "ancova", "var3" ~ "chisq.test"), test_args = list(all_tests("ancova") ~ list(by = group, adj.vars = c("age", "sex"))))

I get the following error

'...' must be empty
Problematic argument: test_args = (...)

I found a similar question here, but it was not answered unfortunately. Any help is greatly appreciated!

Solution

To get adjusted results, you will need to use add_p() with a custom function for the p-value calculation. See below!

library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.7.2'
set.seed(1234)

df <- data.frame(
  group = c(2, 1, 1, 2, 1, 2), 
  var1 = (rnorm(6,mean=10, sd=3)), 
  var2 = (rnorm(6,mean=6, sd=1)), 
  var3 = c(0, 4, 1, 3, 1, 1), 
  age = c(50, 32, 26, 46, 38, 62), 
  sex = c(1, 0, 1, 1, 1, 0))

my_ancova <- function(data, variable, by, ...) {
  aov(
    formula = reformulate(c(by, c("age", "sex")), response = variable),
    data = data
  ) |> 
    broom::tidy() |> 
    dplyr::slice_head(n = 1L) |> 
    dplyr::mutate(
      method = glue::glue("ANCOVA adjusted for age and sex")
    )
}
my_ancova(df, variable = "var1", by = "group")
#> # A tibble: 1 × 7
#>   term     df sumsq meansq statistic p.value method                         
#>   <chr> <dbl> <dbl>  <dbl>     <dbl>   <dbl> <glue>                         
#> 1 group     1  35.1   35.1      3.07   0.222 ANCOVA adjusted for age and sex
aov(var1 ~ group + age + sex, data = df) |> broom::tidy()
#> # A tibble: 4 × 6
#>   term         df sumsq meansq statistic p.value
#>   <chr>     <dbl> <dbl>  <dbl>     <dbl>   <dbl>
#> 1 group         1 35.1   35.1     3.07     0.222
#> 2 age           1 16.5   16.5     1.44     0.352
#> 3 sex           1  1.08   1.08    0.0942   0.788
#> 4 Residuals     2 22.8   11.4    NA       NA

df |> 
  tbl_summary(
    include = c("var1", "var2"),
    by = group, 
    type= c("var1", "var2") ~ 'continuous',
    statistic = c("var1", "var2") ~ "{mean} ± {sd}"
  ) |> 
  add_p(test = everything() ~ my_ancova, pvalue_fun = ~style_pvalue(.x, digits = 3)) |> 
  as_kable()

Characteristic	1, N = 3	2, N = 3	p-value
var1	11.79 ± 1.29	6.95 ± 4.31	0.222
var2	5.47 ± 0.05	5.18 ± 0.22	0.202

^{Created on 2024-03-13 with reprex v2.1.0}