I'm trying to create a tbl_summary that has a strata category and, within each stratum, two separate categorical (likely binary) variables. Here's an example of how I want the table to be laid out; note that the n/% for d are placeholders and not true to my example dataset.
I don't want to combine variables b and c, as these are distinct and independent variables per observation. I've attempted to achieve my desired table structure with a combination of tbl_summary, tbl_strata, and tbl_merge, but I can't get it working correctly.
Here's my minimal example (R v4.2.2):
library(readr)
library(dplyr)
library(tidyverse)
library(gtsummary)

df <- data.frame(
  id = 1:10,
  a = c('red', 'blue', 'red', 'red', 'blue', 'red', 'blue', 'blue', 'blue', 'red'),
  b = c('yes', 'yes', 'yes', 'no', 'yes', 'no', 'yes', 'yes', 'yes', 'no'),
  c = c('cheese', 'cheese', 'steak', 'steak', 'cheese', 'steak', 'steak', 'cheese', 'steak', 'steak'),
  d = c(22, 82, 44, 56, 27, 61, 22, 19, 38, 47)
)
df$a <- factor(df$a)
df$b <- factor(df$b)
df$c <- factor(df$c)
df$d <- factor(df$d)

t1 <- df %>%
  select(a, b, d) %>%
  mutate(a = paste("a=", a)) %>%
  mutate(b = paste("b=", b)) %>%
  tbl_strata(
    strata = a,
    .tbl_fun =
      ~ .x %>%
        tbl_summary(by = b, missing = "no"),
    .header = "**{strata}**, N = {n}"
  )

t2 <- df %>%
  select(a, c, d) %>%
  mutate(a = paste("a=", a)) %>%
  mutate(c = paste("c=", c)) %>%
  tbl_strata(
    strata = a,
    .tbl_fun =
      ~ .x %>%
        tbl_summary(by = c, missing = "no"),
    .header = "**{strata}**, N = {n}"
  )

tbl_merge(
  tbls = list(t1, t2),
  tab_spanner = c("**b**", "**c**")
)
This code produces this table, which doesn't have the correct columns for b and c, and is missing the overall strata variable a.
Some further attempts have produced the right layout, but the N/% are incorrect and duplicated between the strata values:
df <- data.frame(
  id = 1:11,
  a = c('red', 'blue', 'red', 'red', 'blue', 'red', 'blue', 'blue', 'blue', 'red', 'blue'),
  b = c('yes', 'yes', 'yes', 'no', 'yes', 'no', 'yes', 'yes', 'yes', 'no', 'no'),
  x = c('cheese', 'cheese', 'steak', 'steak', 'cheese', 'steak', 'steak', 'cheese', 'steak', 'steak', 'cheese'),
  d = c(22, 82, 44, 56, 27, 61, 22, 19, 38, 47, 38)
)
df$a <- factor(df$a)
df$b <- factor(df$b)
df$x <- factor(df$x)
df$d <- factor(df$d)

t3 <- df %>%
  select(b, d) %>%
  mutate(b = paste("b=", b)) %>%
  tbl_summary(
    by = b,
    missing = "no"
  )

t4 <- df %>%
  select(x, d) %>%
  mutate(x = paste("x=", x)) %>%
  tbl_summary(
    by = x,
    missing = "no"
  )

df %>% tbl_strata(
  strata = a,
  .tbl_fun =
    ~ tbl_merge(
      tbls = list(t3, t4)
    ),
  .header = "**a={strata}**, N = {n}"
)
I moved the tbl_strata call from the individual tables so that it wraps the merged table instead, and passed the tbl_merge call (which doesn't use ~ .x %>%) to tbl_strata as .tbl_fun.
This is close, if I can fix the values for d, which are incorrectly duplicated between values of a.
Hope this is what you're after!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.7.2'

df <- data.frame(
  id = 1:10,
  a = c('red', 'blue', 'red', 'red', 'blue', 'red', 'blue', 'blue', 'blue', 'red'),
  b = c('yes', 'yes', 'yes', 'no', 'yes', 'no', 'yes', 'yes', 'yes', 'no'),
  c = c('cheese', 'cheese', 'steak', 'steak', 'cheese', 'steak', 'steak', 'cheese', 'steak', 'steak'),
  d = c(22, 82, 44, 56, 27, 61, 22, 19, 38, 47)
)
df$a <- factor(df$a)
df$b <- factor(df$b)
df$c <- factor(df$c)
df$d <- factor(df$d)

# first create a function to create half of the table:
# one tbl_summary() per `by` variable, merged side by side
tbl_summary_half_merge <- function(data, by, include) {
  purrr::map(
    by,
    ~ tbl_summary(data, by = all_of(.x), include = all_of(include)) |>
      modify_header(all_stat_cols() ~ paste0("**", .x, " = {level}**"))
  ) |>
    tbl_merge(tab_spanner = FALSE)
}

# testing our first function
tbl_summary_half_merge(df, by = c("b", "c"), include = "d") |> as_kable()
| Characteristic | b = no  | b = yes | c = cheese | c = steak |
|----------------|---------|---------|------------|-----------|
| d              |         |         |            |           |
| 19             | 0 (0%)  | 1 (14%) | 1 (25%)    | 0 (0%)    |
| 22             | 0 (0%)  | 2 (29%) | 1 (25%)    | 1 (17%)   |
| 27             | 0 (0%)  | 1 (14%) | 1 (25%)    | 0 (0%)    |
| 38             | 0 (0%)  | 1 (14%) | 0 (0%)     | 1 (17%)   |
| 44             | 0 (0%)  | 1 (14%) | 0 (0%)     | 1 (17%)   |
| 47             | 1 (33%) | 0 (0%)  | 0 (0%)     | 1 (17%)   |
| 56             | 1 (33%) | 0 (0%)  | 0 (0%)     | 1 (17%)   |
| 61             | 1 (33%) | 0 (0%)  | 0 (0%)     | 1 (17%)   |
| 82             | 0 (0%)  | 1 (14%) | 1 (25%)    | 0 (0%)    |
# now use that function with tbl_strata()
tbl <-
  tbl_strata(
    df,
    strata = "a",
    .tbl_fun =
      ~ tbl_summary_half_merge(.x, by = c("b", "c"), include = "d"),
    .header = "**a = {strata}**"
  )
Created on 2024-04-30 with reprex v2.1.0
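As a side note on the second attempt in the question: t3 and t4 there are built from the full df before tbl_strata() runs, and the .tbl_fun formula never refers to .x, so every stratum receives the same merged table. That would explain the duplicated values for d. Below is a minimal sketch of that attempt with the summaries moved inside .tbl_fun so they are computed on each stratum's subset. It assumes the 11-row df from the question (columns a, b, x, d) and gtsummary 1.7.x; tbl_fixed is just an illustrative name, and the column headers are left at their defaults rather than relabelled.

```r
library(gtsummary)

# The 11-row example data from the question's second attempt
df <- data.frame(
  a = c('red', 'blue', 'red', 'red', 'blue', 'red', 'blue', 'blue', 'blue', 'red', 'blue'),
  b = c('yes', 'yes', 'yes', 'no', 'yes', 'no', 'yes', 'yes', 'yes', 'no', 'no'),
  x = c('cheese', 'cheese', 'steak', 'steak', 'cheese', 'steak', 'steak', 'cheese', 'steak', 'steak', 'cheese'),
  d = factor(c(22, 82, 44, 56, 27, 61, 22, 19, 38, 47, 38))
)

# Build both summaries from .x (the rows of the current stratum),
# not from objects computed on the full data beforehand
tbl_fixed <-
  tbl_strata(
    df,
    strata = a,
    .tbl_fun = ~ tbl_merge(
      tbls = list(
        tbl_summary(.x, by = b, include = d, missing = "no"),
        tbl_summary(.x, by = x, include = d, missing = "no")
      ),
      tab_spanner = FALSE
    ),
    .header = "**a={strata}**, N = {n}"
  )
```

This is the same idea as tbl_summary_half_merge() above, only written inline: whatever is passed as .tbl_fun has to take the stratum's data as its argument for the counts to differ between values of a.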