Tags: r, ggplot2, label, geom-bar, geom-text

How to show labels in geom_text that are proportional to the geom_bar grouping variable


I have been trying to produce a ggplot graph whose labels show percentages computed relative to the grouping factor defined in geom_bar. Instead of percentages of the overall population, I would like each label to be a percentage of its own sub-group (in this case PlaceA and PlaceB), but I have not managed to do it. See the reproducible example below.

Reproducible dataframe

Random <- data.frame(replicate(3, sample(0:3, 3024, replace = TRUE)))
Random$Trxn_type <- sample(c("Debit", "Credit"),
                       size = nrow(Random), 
                       prob = c(0.76, 0.24), replace = TRUE)
Random$YN <- sample(c("Yes", "No"),
                       size = nrow(Random), 
                       prob = c(0.76, 0.24), replace = TRUE)
Random$Place <- sample(c("PlaceA", "PlaceB"),
                       size = nrow(Random), 
                       prob = c(0.76, 0.24), replace = TRUE)

Random<-Random[, 4:6]

I then applied the following code:

Share<-ggplot(Random, aes(x = YN, fill=Place)) +
scale_fill_brewer(palette="Greens")+
geom_bar(aes(y = ..prop.., group = Place),position = position_dodge()) + 
facet_wrap(~ Random$Trxn_type, scales = "free_x", ncol=2)+ 
theme(strip.text.x = element_text(size = 15, colour = "black"))+
theme(panel.background = element_rect(fill = "white"),legend.position = "bottom")+
scale_y_continuous(labels = percent)+
ylab("Frequency") + 
coord_flip()+ 
xlab("Answers") + 
theme(plot.title = element_text(size = 16, face = "bold"),
      axis.text=element_text(size=12),
      axis.title=element_text(size=12))+
geom_text(aes(y=..prop..,label=scales::percent((..count..)/tapply(..count..,..PANEL..,sum)[..PANEL..])),
          stat="count", vjust=-.5, position=position_dodge(.9)) 
Share

and got the following output:

[Image: plot produced by the code above]

Instead of this percentage distribution, I would like to see the percentage of replies with Place A and Place B treated as two separate populations. Put more simply, I would like the labels to show the percentage corresponding to the length of each bar, so that the bars for Place A in Credit sum to 100% and the bars for Place B in Credit sum to 100%. The same would apply to Debit.

Thanks!


Solution

  • Here is a solution that computes the proportions with dplyr and then pipes the result to ggplot. In this first version each count is divided by the total of its Trxn_type.
    I have also put all theme settings in the same call to theme().
    I have reposted the data creation code, this time setting the RNG seed in order to make the data example reproducible.

    library(dplyr)
    library(ggplot2)
    
    Random %>%
      count(Trxn_type, YN, Place) %>%
      left_join(Random %>% count(Trxn_type, name = "m"), by = "Trxn_type") %>%
      mutate(Prop = n/m) %>%
      ggplot(aes(x = YN, y = Prop, fill = Place)) +
      geom_col(position = position_dodge()) +
      geom_text(aes(label = scales::percent(Prop)),
                hjust = -0.25, 
                position = position_dodge(0.9)) +
      facet_wrap(~ Trxn_type, scales = "free_x", ncol = 2) +
      scale_fill_brewer(palette = "Greens") +
      scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
      xlab("Answers") +
      ylab("Frequency") +
      coord_flip() +
      theme(panel.background = element_rect(fill = "white"),
            legend.position = "bottom",
            strip.text.x = element_text(size = 15, colour = "black"),
            plot.title = element_text(size = 16, face = "bold"),
            axis.text = element_text(size = 12),
            axis.title = element_text(size = 12))
    

    [Image: plot produced by the code above]

    Edit.

    Following the OP's comment, here is a way to also count by Place, so that each count is divided by the total of its Trxn_type and Place. The only change to the code above is the left_join instruction.

      left_join(Random %>% count(Trxn_type, Place, name = "m"),
                by = c("Trxn_type", "Place")) %>%
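
    As a possible alternative to the join, the per-group totals could be computed with group_by() and mutate(). The block below is only a sketch: it repeats the data creation code from this answer so that it runs on its own, the names props, check and p are made up for illustration, and the theme settings are left out for brevity.

    ```r
    library(dplyr)
    library(ggplot2)

    # Data creation code from this answer, repeated so the block is self-contained.
    set.seed(1234)
    Random <- data.frame(replicate(3, sample(0:3, 3024, replace = TRUE)))
    Random$Trxn_type <- sample(c("Debit", "Credit"), size = nrow(Random),
                               prob = c(0.76, 0.24), replace = TRUE)
    Random$YN <- sample(c("Yes", "No"), size = nrow(Random),
                        prob = c(0.76, 0.24), replace = TRUE)
    Random$Place <- sample(c("PlaceA", "PlaceB"), size = nrow(Random),
                           prob = c(0.76, 0.24), replace = TRUE)
    Random <- Random[, 4:6]

    # Count every Trxn_type/YN/Place combination, then divide each count
    # by the total of its Trxn_type/Place group (no join needed).
    props <- Random %>%
      count(Trxn_type, YN, Place) %>%
      group_by(Trxn_type, Place) %>%
      mutate(Prop = n / sum(n)) %>%
      ungroup()

    # Sanity check (hypothetical helper table): sum the shares per group.
    check <- props %>%
      count(Trxn_type, Place, wt = Prop, name = "total")

    # Same plot structure as above, without the theme() call.
    p <- ggplot(props, aes(x = YN, y = Prop, fill = Place)) +
      geom_col(position = position_dodge()) +
      geom_text(aes(label = scales::percent(Prop)),
                hjust = -0.25,
                position = position_dodge(0.9)) +
      facet_wrap(~ Trxn_type, scales = "free_x", ncol = 2) +
      scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
      coord_flip()
    ```

    If it behaves as intended, check$total is 1 for every Trxn_type/Place pair, meaning the labels for PlaceA and for PlaceB each add up to 100% within a facet.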
    

    [Image: plot produced by the modified code]

    Data creation code.

    set.seed(1234)
    Random <- data.frame(replicate(3, sample(0:3, 3024, replace = TRUE)))
    Random$Trxn_type <- sample(c("Debit", "Credit"),
                               size = nrow(Random),
                               prob = c(0.76, 0.24), replace = TRUE)
    Random$YN <- sample(c("Yes", "No"),
                        size = nrow(Random),
                        prob = c(0.76, 0.24), replace = TRUE)
    Random$Place <- sample(c("PlaceA", "PlaceB"),
                           size = nrow(Random),
                           prob = c(0.76, 0.24), replace = TRUE)
    
    Random <- Random[, 4:6]
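
    If pre-computing the table is not wanted, the same per-Place shares can probably be obtained inside ggplot2 by giving geom_text() the same group = Place that geom_bar() has, so that the prop variable computed by stat_count() is calculated per Place within each facet. A minimal sketch, assuming ggplot2 3.3.0 or later for after_stat(); the data frame toy and its values are invented for illustration only.

    ```r
    library(ggplot2)

    # Hypothetical toy data: four answers per Place within each transaction type.
    toy <- data.frame(
      Trxn_type = rep(c("Credit", "Debit"), each = 8),
      Place     = rep(rep(c("PlaceA", "PlaceB"), each = 4), times = 2),
      YN        = c("Yes", "Yes", "Yes", "No",   # Credit, PlaceA
                    "Yes", "No",  "No",  "No",   # Credit, PlaceB
                    "Yes", "Yes", "No",  "No",   # Debit,  PlaceA
                    "Yes", "Yes", "Yes", "No")   # Debit,  PlaceB
    )

    # group = Place is set once in ggplot(), so both layers share it and
    # stat_count() computes prop within each Place (and within each facet).
    p <- ggplot(toy, aes(x = YN, fill = Place, group = Place)) +
      geom_bar(aes(y = after_stat(prop)), position = position_dodge()) +
      geom_text(aes(y = after_stat(prop),
                    label = scales::percent(after_stat(prop))),
                stat = "count", hjust = -0.25,
                position = position_dodge(0.9)) +
      facet_wrap(~ Trxn_type, ncol = 2) +
      scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
      coord_flip()

    p
    ```

    Without the explicit group, ggplot2 would group by the combination of YN and Place, and every bar would get a prop of 1.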