r ggplot2 bar-chart visualization confidence-interval

Plot 95% CIs for a table of proportions in ggplot2

For error bars, I find the 95% CI the most informative choice, so I want to plot it for my table of proportions. How do I correctly calculate the 95% CI for a proportion table and plot it with ggplot2?

data:

The data contain the proportion of data collected (number of schools) per region (A to E).

## calculate proportions:

region <- data %>% 
  count(Q8) %>% 
  mutate(prop = round(prop.table(n) * 100, digits = 2),  # percent scale (0-100)
         sd = round(sd(prop.table(n)), digits = 2), 
         Q8 = fct_reorder(Q8, n)) %>% 
  arrange(n)

## output
> region
  Q8  n  prop   sd
1  E  3 10.34 0.12
2  C  3 10.34 0.12
3  B  4 13.79 0.12
4  A  9 31.03 0.12
5  D 10 34.48 0.12

Cool. Now I need to calculate the 95% CI. I've tried:

region_ci  <- data.frame(DescTools::MultinomCI(region$n, conf.level = 0.95)) %>%
               mutate_if(is.numeric, round, 2)
 
> region_ci
   est lwr.ci upr.ci
1 0.10   0.00   0.29
2 0.10   0.00   0.29
3 0.14   0.00   0.32
4 0.31   0.14   0.49
5 0.34   0.17   0.53

Now I'd like to plot the proportions with error bars. My attempt:

 region %>% 
  ggplot(aes(y = prop, x = ordered(Q8), fill = Q8)) + 
  geom_bar(stat = "identity", width = 0.3) +
  geom_errorbar(aes(ymin= region_ci$lwr.ci, ymax= region_ci$upr.ci, 
                    width= .1)) + 
  geom_text(aes(label = round(prop, 1.5)),
            nudge_y = 2) + # so the labels don't hit the tops of the bars
  labs(x = "place",
       y = '(%)')

which gives me this:

Question: The error bars are clearly wrong, so I assume I've calculated the CIs incorrectly. How can I calculate them properly and plot the correct error bars? I've seen some similar posts, such as this one, but I'm still not sure how to calculate the CIs correctly.
I've also tried the approach suggested here, but that gave me weird results too. Thanks in advance.
data:

> dput(region)
structure(list(Q8 = structure(1:5, .Label = c("E", "C", "B", 
"A", "D"), class = "factor"), n = c(3L, 3L, 4L, 9L, 10L), prop = c(10.34, 
10.34, 13.79, 31.03, 34.48), sd = c(0.12, 0.12, 0.12, 0.12, 0.12
)), row.names = c(NA, -5L), class = "data.frame")

Solution

Your CI calculation is correct. The only issue is one of scale: the prop values in region are percentages (proportion * 100), whereas the CI bounds in region_ci are proportions between 0 and 1. So in the ggplot call, multiply the lower and upper CI bounds by 100 too:

region %>% 
  ggplot(aes(y = prop, x = ordered(Q8), fill = Q8)) + 
  geom_bar(stat = "identity", width = 0.3) +
  # CI bounds are proportions (0-1); scale them to percent to match prop
  geom_errorbar(aes(ymin = region_ci$lwr.ci * 100, ymax = region_ci$upr.ci * 100), 
                width = .1) + 
  geom_text(aes(label = round(prop, 1.5)),
            nudge_y = 2) + # so the labels don't hit the tops of the bars
  labs(x = "place",
       y = '(%)')
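
A variant you may find less fragile: referring to region_ci$... inside aes() only works while both data frames stay in the same row order. The sketch below scales the bounds once and binds them onto the plotting data, so each bar and its error bar come from the same row. The name plot_data is my own, region is rebuilt from the dput() output in the question, and it assumes (as in the question's output) that MultinomCI returns its rows in the order of the input counts.

```r
library(ggplot2)

# `region` rebuilt from the dput() output in the question
region <- data.frame(
  Q8 = factor(c("E", "C", "B", "A", "D"), levels = c("E", "C", "B", "A", "D")),
  n  = c(3L, 3L, 4L, 9L, 10L)
)
region$prop <- round(100 * region$n / sum(region$n), 2)  # percent scale

# MultinomCI() returns a matrix of proportions (0-1) with columns
# est, lwr.ci, upr.ci; scale it once so it matches `prop`
ci <- DescTools::MultinomCI(region$n, conf.level = 0.95) * 100

# plot_data is a hypothetical name: counts, percentages and bounds in one frame
plot_data <- cbind(region, ci)

p <- ggplot(plot_data, aes(x = Q8, y = prop, fill = Q8)) +
  geom_col(width = 0.3) +  # same as geom_bar(stat = "identity")
  geom_errorbar(aes(ymin = lwr.ci, ymax = upr.ci), width = 0.1) +
  geom_text(aes(label = round(prop, 1)), nudge_y = 2) +
  labs(x = "place", y = "(%)")
p
```

Keeping everything in one data frame also means that a later arrange() or filter() cannot silently misalign the bars and their intervals.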

graph output
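
If you want a rough sanity check on the size of those bounds, you could compare them with per-category Wilson score intervals from base R. This is a different technique from the one in the question: the intervals below treat each region as its own binomial proportion and are not simultaneous, so they will not match the MultinomCI output exactly. The object name wilson is just for illustration.

```r
# Counts per region, in the same order as the `region` table (E, C, B, A, D)
n <- c(E = 3, C = 3, B = 4, A = 9, D = 10)
total <- sum(n)

# prop.test() without continuity correction gives the Wilson score interval
wilson <- t(sapply(n, function(k) {
  ci <- prop.test(k, total, correct = FALSE)$conf.int
  c(est = k / total, lwr = ci[1], upr = ci[2])
}))

# Report on the percent scale so it is comparable with `prop`
round(wilson * 100, 1)
```

Wide intervals are expected here: with only 29 schools in total, any CI for these proportions will span a large part of the bar.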