I want to cross-tabulate member and author in the rows against review, publish and pay in the columns, showing row and column totals with percentages in brackets and a chi-square test in the footnote.
#data
set.seed(123)
member <- sample(c("Yes", "No"), 100, replace = TRUE)
author <- sample(c("Yes", "No"), 100, replace = TRUE)
review <- sample(0:10, 100, replace = TRUE)
publish <- sample(0:10, 100, replace = TRUE)
pay <- sample(0:10, 100, replace = TRUE)
data <- data.frame(member, author, review, publish, pay)
I want review, publish and pay to be grouped as No (0-4), Maybe (5) and Yes (6-10). I recently found out about gtsummary, which should produce the result I want, but I'm struggling to get there. So far I have this with the tidyverse and kableExtra:
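library(dplyr)
library(kableExtra)
# Count reviews per member in three bands: below 5, exactly 5, above 5
data |>
  group_by(member) |>
  summarise(
    Disagree = sum(review < 5),
    Neutral = sum(review == 5),
    Agree = sum(review > 5)) |>
  kbl(caption = "Review by member") |>
  kable_paper("hover", full_width = FALSE, html_font = "Cambria")
# Fisher's exact test on the ungrouped scores, with a simulated p-value
fisher.test(table(data$member, data$review), simulate.p.value = TRUE)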
Thanks for your help. I could not post the image because I need 10 reputation (I don't know what that means)
The preferred output has review, publish and pay as three columns, each with the groups No, Maybe and Yes.
Update: We could add tbl_split(., c(author, review_group, publish_group, pay_group)) to the end of the pipeline. Here you will get 4 separate tables that you could put side by side:
library(dplyr)
library(gtsummary)
data %>%
  mutate(across(c(review, publish, pay),
                ~cut(., breaks = c(-Inf, 4.5, 5.5, Inf),
                     labels = c("No", "Maybe", "Yes"),
                     include.lowest = TRUE),
                .names = "{.col}_group")) %>%
  select(member, author, ends_with("group")) %>%
  tbl_summary(
    by = member,
    missing = "no",
    statistic = list(all_categorical() ~ "{n} ({p}%)"),
    digits = list(all_categorical() ~ c(0, 1))
  ) %>%
  add_p(test = all_categorical() ~ "chisq.test") %>%
  tbl_split(., c(author, review_group, publish_group, pay_group))
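If one table per outcome with row and column totals is closer to what the question asks for, tbl_cross() from gtsummary might be worth a look. The following is only a sketch for member against review: the object name cross_review is made up for illustration, and the choice of row percentages is an assumption about what "percentages in brackets" should mean here. As far as I know, add_p() on a tbl_cross object accepts source_note = TRUE, which moves the p-value below the table instead of into a column.

```r
library(dplyr)
library(gtsummary)

# Recreate the example data from the question
set.seed(123)
member <- sample(c("Yes", "No"), 100, replace = TRUE)
author <- sample(c("Yes", "No"), 100, replace = TRUE)
review <- sample(0:10, 100, replace = TRUE)
publish <- sample(0:10, 100, replace = TRUE)
pay <- sample(0:10, 100, replace = TRUE)
data <- data.frame(member, author, review, publish, pay)

# cross_review is a hypothetical name used only in this sketch
cross_review <- data %>%
  mutate(review_group = cut(review,
                            breaks = c(-Inf, 4.5, 5.5, Inf),
                            labels = c("No", "Maybe", "Yes"))) %>%
  tbl_cross(
    row = member,
    col = review_group,
    percent = "row",             # assumption: n (row %) in each cell
    margin = c("column", "row")  # total column and total row
  ) %>%
  add_p(test = "chisq.test", source_note = TRUE)  # p-value as a source note

cross_review
```

The same call could be repeated with row = author, or with publish_group and pay_group as col, to cover the other combinations.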
First answer: We could do it this way:
library(dplyr)
library(gtsummary)
data %>%
  mutate(across(c(review, publish, pay),
                ~cut(., breaks = c(-Inf, 4.5, 5.5, Inf),
                     labels = c("No", "Maybe", "Yes"),
                     include.lowest = TRUE),
                .names = "{.col}_group")) %>%
  select(member, author, ends_with("group")) %>%
  tbl_summary(
    by = member,
    missing = "no",
    statistic = list(all_categorical() ~ "{n} ({p}%)"),
    digits = list(all_categorical() ~ c(0, 1))
  ) %>%
  add_p(test = all_categorical() ~ "chisq.test")
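To see what sits behind such a table, the binning and the test can also be checked in base R. This is a minimal sketch for member against review only. The object names (bins, counts, with_totals, row_pct, cells, test) are made up for illustration, and row percentages are an assumption about the layout wanted.

```r
# Recreate the relevant part of the example data from the question
set.seed(123)
member <- sample(c("Yes", "No"), 100, replace = TRUE)
author <- sample(c("Yes", "No"), 100, replace = TRUE)
review <- sample(0:10, 100, replace = TRUE)
data <- data.frame(member, author, review)

# Same breaks as in the answer: 0-4 -> No, 5 -> Maybe, 6-10 -> Yes
data$review_group <- cut(data$review,
                         breaks = c(-Inf, 4.5, 5.5, Inf),
                         labels = c("No", "Maybe", "Yes"))

# Score by group: each score should land in exactly one group
bins <- table(data$review, data$review_group)

# Counts, plus a copy with row and column totals
counts <- table(member = data$member, review = data$review_group)
with_totals <- addmargins(counts)

# Row percentages, rounded to one decimal place
row_pct <- round(100 * prop.table(counts, margin = 1), 1)

# Cells formatted as "n (p%)", keeping the row and column names
cells <- matrix(sprintf("%d (%.1f%%)", as.vector(counts), as.vector(row_pct)),
                nrow = nrow(counts), dimnames = dimnames(counts))

# Pearson chi-square test on the grouped table
test <- chisq.test(counts)

print(bins)
print(with_totals)
print(cells)
print(test)
```

With small expected counts in the Maybe group, chisq.test() may warn that the approximation is poor. In that case fisher.test(counts) would be the usual fallback.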