I am only allowed to output data rounded to the nearest 5. I have figured out how to do this for the categorical data rows; however, the header counts and percentages are unchanged.
Example code:
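library(dplyr)
library(gtsummary)

ggplot2::mpg %>%
  select(manufacturer, drv, displ) %>%
  tbl_summary(
    by = drv,
    # round categorical counts to the nearest 5; show percentages with 0 decimals
    digits = list(all_categorical() ~ c(function(x) round(x / 5) * 5, 0)),
    type = list(displ = "continuous2"))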
Instead of (e.g.) N = 103, I want the header to display N = 105. I'd also like the percentages to correspond to the rounded values.
Thanks!
@Edward's response is great! If you want the percentages in your table to be calculated from the rounded n's, you'll need to take a step back from tbl_summary(). The tbl_summary() function uses the {cards} package to perform all tabulations, and that is where we'll need to go to modify how the percentage is calculated.
In the example below, we first calculate an analysis result dataset (ARD) using cards, then pass that ARD to tbl_ard_summary() to build the table. It's a bit complex, but I am not sure of a simpler way!
library(gtsummary)
library(cards)

ard <-
  # calculate counts and big N
  ard_categorical(
    ggplot2::mpg,
    variables = manufacturer,
    by = drv,
    statistic = ~c("n", "N"),
    # round little n to the nearest 5
    fmt_fn = ~list(n = function(x) round(x / 5) * 5)
  ) |>
  # add the percentage using n rounded to the nearest 5
  cards::add_calculated_row(
    expr = round(n / 5) * 5 / N,
    stat_name = "p",
    stat_label = "%",
    fmt_fn = label_style_percent()
  ) |>
  # add calculations for the continuous summaries
  bind_ard(
    ard_stack(
      ggplot2::mpg,
      .by = drv,
      ard_continuous(variables = displ)
    )
  )

print(ard, n = 3) # only print the first three rows
#> {cards} data frame: 168 x 11
#> group1 group1_level variable variable_level stat_name stat_label stat
#> 1 drv 4 manufacturer audi n n 11
#> 2 drv 4 manufacturer audi N N 103
#> 3 drv f manufacturer audi n n 7
#> ℹ 165 more rows
#> ℹ Use `print(n = ...)` to see more rows
#> ℹ 4 more variables: context, fmt_fn, warning, error
# use the ARD-first workflow to create the table
ard |>
  tbl_ard_summary(
    by = drv,
    # showing the big N to illustrate that the percentage calculation is correct
    statistic = all_categorical() ~ "{n} / {N} ({p}%)"
  ) |>
  # round the header N to the nearest 5 as well
  modify_header(all_stat_cols() ~ "**{level}** \n N = {round(n/5)*5}") |>
  bold_labels() |>
  # print table as kable so it renders on stackoverflow
  as_kable()
Characteristic | 4 N = 105 | f N = 105 | r N = 25 |
---|---|---|---|
manufacturer | | | |
audi | 10 / 103 (9.7%) | 5 / 106 (4.7%) | 0 / 25 (0%) |
chevrolet | 5 / 103 (4.9%) | 5 / 106 (4.7%) | 10 / 25 (40%) |
dodge | 25 / 103 (24%) | 10 / 106 (9.4%) | 0 / 25 (0%) |
ford | 15 / 103 (15%) | 0 / 106 (0%) | 10 / 25 (40%) |
honda | 0 / 103 (0%) | 10 / 106 (9.4%) | 0 / 25 (0%) |
hyundai | 0 / 103 (0%) | 15 / 106 (14%) | 0 / 25 (0%) |
jeep | 10 / 103 (9.7%) | 0 / 106 (0%) | 0 / 25 (0%) |
land rover | 5 / 103 (4.9%) | 0 / 106 (0%) | 0 / 25 (0%) |
lincoln | 0 / 103 (0%) | 0 / 106 (0%) | 5 / 25 (20%) |
mercury | 5 / 103 (4.9%) | 0 / 106 (0%) | 0 / 25 (0%) |
nissan | 5 / 103 (4.9%) | 10 / 106 (9.4%) | 0 / 25 (0%) |
pontiac | 0 / 103 (0%) | 5 / 106 (4.7%) | 0 / 25 (0%) |
subaru | 15 / 103 (15%) | 0 / 106 (0%) | 0 / 25 (0%) |
toyota | 15 / 103 (15%) | 20 / 106 (19%) | 0 / 25 (0%) |
volkswagen | 0 / 103 (0%) | 25 / 106 (24%) | 0 / 25 (0%) |
displ | 4.0 (2.8, 4.7) | 2.4 (2.0, 3.0) | 5.4 (4.6, 5.7) |
Created on 2024-12-12 with reprex v2.1.1
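If you want to sanity-check the arithmetic behind the table, here is a minimal base R sketch for the audi row. The helper `round_to()` is a hypothetical name introduced only for this illustration (it is not part of gtsummary or cards), and the counts are copied by hand from the ARD and table above.

```r
# Hypothetical helper, not from gtsummary or cards:
# round x to the nearest multiple of `base`
round_to <- function(x, base = 5) round(x / base) * base

# Raw audi counts (n) and column totals (N) by drive type, copied from the output above
n <- c(`4` = 11, f = 7, r = 0)
N <- c(`4` = 103, f = 106, r = 25)

n_rounded <- round_to(n)  # cell counts shown in the table: 10, 5, 0
N_rounded <- round_to(N)  # header counts shown in the table: 105, 105, 25

# Percentage as computed in the answer: rounded n over the unrounded N
pct <- round(100 * n_rounded / N, 1)          # 9.7, 4.7, 0

# Possible variation (an assumption, not what the answer does):
# use the rounded N in the denominator too
pct_alt <- round(100 * n_rounded / N_rounded, 1)  # 9.5, 4.8, 0

print(rbind(n_rounded, N_rounded, pct, pct_alt))
```

Note that the table divides the rounded n by the unrounded N. If the percentages should match the rounded header counts instead, the `expr` passed to `add_calculated_row()` could presumably be changed to `round(n / 5) * 5 / (round(N / 5) * 5)`, though I have not run that variation. Also, since the counts are integers, `x / 5` never lands exactly on .5, so R's round-half-to-even behaviour does not come into play here.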