For error bars, I find the 95% confidence interval (CI) the most informative, so I want to plot it for my table of proportions. How do I correctly calculate the 95% CI for a table of proportions and plot it with ggplot2?
The data contain the number of schools sampled per region (A to E), expressed as proportions.
## calculate proportions:
region <- data %>%
  count(Q8) %>%
  mutate(prop = round(prop.table(n) * 100, digits = 2),
         sd = round(sd(prop.table(n)), digits = 2),
         Q8 = fct_reorder(Q8, n)) %>%
  arrange(n)
## output
> region
  Q8  n  prop   sd
1  E  3 10.34 0.12
2  C  3 10.34 0.12
3  B  4 13.79 0.12
4  A  9 31.03 0.12
5  D 10 34.48 0.12
region_ci <- data.frame(DescTools::MultinomCI(region$n, conf.level = 0.95)) %>%
mutate_if(is.numeric, round, 2)
> region_ci
   est lwr.ci upr.ci
1 0.10   0.00   0.29
2 0.10   0.00   0.29
3 0.14   0.00   0.32
4 0.31   0.14   0.49
5 0.34   0.17   0.53
region %>%
ggplot(aes(y = prop, x = ordered(Q8), fill = Q8)) +
geom_bar(stat = "identity", width = 0.3) +
geom_errorbar(aes(ymin= region_ci$lwr.ci, ymax= region_ci$upr.ci,
width= .1)) +
geom_text(aes(label = round(prop, 1.5)),
nudge_y = 2) + # so the labels don't hit the tops of the bars
labs(x = "place",
y = '(%)')
Question: It seems clear that I've calculated the CIs incorrectly. How can I calculate them properly and plot the correct error bars? I've seen some similar posts, such as this one, but I'm still not sure how to calculate the CIs correctly.
I've also tried the approach suggested here, but I got strange results as well. Thanks in advance.
data:
> dput(region)
structure(list(Q8 = structure(1:5, .Label = c("E", "C", "B",
"A", "D"), class = "factor"), n = c(3L, 3L, 4L, 9L, 10L), prop = c(10.34,
10.34, 13.79, 31.03, 34.48), sd = c(0.12, 0.12, 0.12, 0.12, 0.12
)), row.names = c(NA, -5L), class = "data.frame")
Your CI calculation is correct. The only issue is a scale mismatch: the prop values in region are percentages (proportion * 100), whereas the CI bounds in region_ci are proportions between 0 and 1. So multiply the lower and upper CI values by 100 in the ggplot call as well:
region %>%
ggplot(aes(y = prop, x = ordered(Q8), fill = Q8)) +
geom_bar(stat = "identity", width = 0.3) +
geom_errorbar(aes(ymin= region_ci$lwr.ci*100, ymax= region_ci$upr.ci*100,
width= .1)) +
geom_text(aes(label = round(prop, 1.5)),
nudge_y = 2) + # so the labels don't hit the tops of the bars
labs(x = "place",
y = '(%)')
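As an alternative sketch (not part of the original code), the CI columns can be bound to the counts first, so that the bars, labels and error bars all come from one data frame on the same percent scale. This avoids referencing region_ci$... inside aes(), which only works while both objects keep the same row order. The names ci, region_pct and p are made up for illustration, the counts are retyped from the dput() output, and across() assumes dplyr 1.0 or later. Placing the labels above the upper CI bound is a design choice here, so that they do not overlap the error bars.

```r
library(dplyr)
library(ggplot2)

# Counts per region, retyped from the dput() output in the question
region <- data.frame(
  Q8 = factor(c("E", "C", "B", "A", "D"), levels = c("E", "C", "B", "A", "D")),
  n  = c(3L, 3L, 4L, 9L, 10L)
)

# Simultaneous 95% CIs for the multinomial proportions (on a 0-1 scale)
ci <- as.data.frame(DescTools::MultinomCI(region$n, conf.level = 0.95))

# Bind the CI columns row by row, then convert estimate and bounds to percent
region_pct <- region %>%
  bind_cols(ci) %>%
  mutate(across(c(est, lwr.ci, upr.ci), ~ .x * 100))

# Bars, error bars and labels now share one data frame and one scale
p <- ggplot(region_pct, aes(x = Q8, y = est, fill = Q8)) +
  geom_col(width = 0.3) +
  geom_errorbar(aes(ymin = lwr.ci, ymax = upr.ci), width = 0.1) +
  geom_text(aes(y = upr.ci, label = round(est, 1)), nudge_y = 2) +
  labs(x = "place", y = "(%)")
p
```

Note that with only 29 schools in total the intervals are wide, and the lower bounds for the smallest regions sit at or near 0, which matches the region_ci output shown in the question.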