I am trying to build on the guidance provided here: [Tests/methods available in add_p() and add_difference()][1] specifically under the section "Custom Functions"
by passing a customized function in add_p()
function. My goal is to have the p-values from an ANOVA test while incorporating survey design effect. The code am using is
rm(list = ls())
library(gtsummary)
library(dplyr)
library(srvyr)
library(survey)
data <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25),
strata = c(10, 20, 30, 10, 20, 20, 10, 20, 30, 30, 10, 30, 30, 20, 10, 20, 20, 20, 10, 20, 20, 30, 30, 20, 30),
weight = c(10, 8, 17, 15, 9, 10, 25, 8, 8, 13, 17, 24, 12, 15, 3, 12, 16, 17, 24, 12, 3, 2, 8, 14, 4),
popgroup = c("A", "B", "A", "C", "A", "C", "B", "B", "B", "A", "A", "C", "A", "B", "A", "C", "B", "A", "C", "B", "A", "B", "C", "B", "B"),
gender = c("Male", "Female", "Female", "Male", "Female", "Female", "Male", "Male", "Female", "Male", "Female", "Female", "Male", "Female", "Male", "Female", "Female", "Male", "Female", "Female", "Male", "Female", "Male", "Female", "Male"),
inc_01 = c(1500, 1200, 130, 500, 750, 2000, 10000, 1500, 1050, 400, 360, 490, 250, 400, 2500, 1300, 800, 540, 690, 520, 600, 700, 700, 600, 400),
inc_02 = c(360, 450, 120, 300, 900, 560, 450, 280, 720, 360, 1000, 900, 530, 820, 640, 520, 130, 140, 150, 650, 240, 130, 200, 300, 500)),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -25L))
dclus2 <- survey::svydesign(
id = ~ id,
strata = ~ strata,
weights = ~ weight,
data = data
)
ttest_common_variance <- function(data, variable = inc_01, by = popgroup, ...) {
model<-svyglm(variable ~ by, design=dclus2)
reg_v <- regTermTest(model,~ by)
reg_v1 <- as.numeric(reg_v$p)
reg_v2 <- tibble( p.value = reg_v1 )
}
tbl_2 <-data |>
as_survey_design(strata = strata, weights = weight) %>%
gtsummary::tbl_svysummary(
by = popgroup,
type = where(is.numeric) ~ "continuous",
statistic = list(c(inc_01, inc_02) ~ "{mean} {mean.std.error}"),
missing = "no",
digits = list(c(inc_01, inc_02) ~ c(4)),
include = c(inc_01, inc_02)
) |>
gtsummary::add_p(
test = c(inc_01, inc_02) ~ "ttest_common_variance",
pvalue_fun = function(x) gtsummary::style_pvalue(x, digits = 3)
) |>
as_tibble()
but am getting the error
There was an error in 'add_p()/add_difference()' for variable 'inc_01', p-value omitted: Error in svyglm.survey.design(variable ~ by, design = dclus2, family = quasibinomial()): all variables must be in design= argument
Any help will be appreciated. Thanks in advance [1]: https://www.danieldsjoberg.com/gtsummary/reference/tests.html
You have several issues in this code. In the order that errors appear:
Error in svyglm.survey.design(variable ~ by, design = dclus2): all variables must be in design= argument
This is quite self-explanatory, any variables that go in your model must be in the svydesign
object.
dclus2 <- survey::svydesign(
id = ~ id,
strata = ~ strata,
weights = ~ weight,
## Include variables measured.
variables = ~ inc_01 + inc_02 + popgroup,
data = data
)
All remaining problems are with your ttest_common_variance
function:
variable
and by
but these are not existing objects, or rather, you're not actually creating a formula out of them.tibble
, but gtsummary
expects a scalar. regTermTest
returns a 1x1 matrix, so we can just take its first element.Here's a version that should work:
ttest_common_variance <- function(data, variable = "inc_01", by = "popgroup", ...) {
model <- svyglm(formula(paste0(variable, "~", by)), design=dclus2)
regTermTest(model, formula(paste0("~", by)))$p[1]
}
... pipeline skipped
#> `**Character**` `**A**, N = 101` `**B**, N = 112` `**C**, N = 93` `**p-value**`
#> <chr> <chr> <chr> <chr> <chr>
#> inc_01 561.9802 137.7631 2,825.3571 1,669.4621 828.1720 169.7938 0.305
#> inc_02 463.3663 130.1409 459.7321 80.4854 463.8710 143.0381 >0.999