I have this data frame:
df.bar <- data.frame(diagnosis = c("A","A","A", "nb" ,"nb", "hg"),
C1 = c(1,1,0,0,1,0), C2 = c(0,1,0,0,0,0))
df.bar
  diagnosis C1 C2
1         A  1  0
2         A  1  1
3         A  0  0
4        nb  0  0
5        nb  1  0
6        hg  0  0
I want to calculate, for each diagnosis, the percentage of rows where C1 and C2 equal 1, as follows:
  diagnosis  C1  C2
1         A 66% 33%
2        nb 50%  0%
3        hg  0%  0%
A base R solution with aggregate() (the \(x) lambda shorthand requires R 4.1 or later):
aggregate(cbind(C1, C2) ~ diagnosis, df.bar,
\(x) paste0(round(mean(x) * 100, 2), '%'))
A dplyr solution:
library(dplyr)
df.bar %>%
group_by(diagnosis) %>%
summarise(across(C1:C2, ~ paste0(round(mean(.x) * 100, 2), '%')))
# # A tibble: 3 × 3
# diagnosis C1 C2
# <chr> <chr> <chr>
# 1 A 66.67% 33.33%
# 2 hg 0% 0%
# 3 nb 50% 0%
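Both solutions above print 66.67% and sort the groups alphabetically (A, hg, nb), whereas the desired output in the question shows 66% and the order A, nb, hg. A minimal base R sketch of one way to match that more closely is below. It assumes the 66% in the question means truncation rather than rounding, and `pct_ones` and `df.ord` are hypothetical names introduced here for illustration.

```r
df.bar <- data.frame(diagnosis = c("A", "A", "A", "nb", "nb", "hg"),
                     C1 = c(1, 1, 0, 0, 1, 0), C2 = c(0, 1, 0, 0, 0, 0))

# Hypothetical helper: share of ones as a whole-number percentage string.
# floor() truncates (66.67 becomes 66); swap in round() to get 67 instead.
pct_ones <- function(x) paste0(floor(mean(x) * 100), "%")

# Make diagnosis a factor whose levels follow order of first appearance,
# so aggregate() keeps A, nb, hg instead of sorting alphabetically.
df.ord <- transform(df.bar,
                    diagnosis = factor(diagnosis, levels = unique(diagnosis)))

res <- aggregate(cbind(C1, C2) ~ diagnosis, df.ord, pct_ones)
res
```

The same factor trick should carry over to the dplyr version, since group_by() orders groups by factor levels when the grouping column is a factor.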