Tags: r, ggplot2, reshape, geom-bar

Stacked bar chart for multiple variables shows wrong percentages


I am trying to draw a stacked bar chart for multiple variables that displays the percentage of each level within each variable. With a single variable it works, for example:

library(ggplot2)

# Category (16 values) is recycled to match Country (96 values)
Category <- rep(c("A", "B", "C", "D"), times = 4)
Country  <- rep(c("Country A", "Country B", "Country C", "Country D", "Country F", "Country G"), times = 16)
Data     <- data.frame(Category, Country)

# Single bar: sum(..count..) equals the bar total, so the labels are correct
ggplot(Data, aes(x = factor(""), fill = factor(Category))) +
  geom_bar(position = "fill") +
  geom_text(aes(label = scales::percent(..count../sum(..count..))), stat = "count", position = position_fill(vjust = 0.5))

[Image: stacked bar chart with correct percentages]

Now I reshape my data frame to long format, but unfortunately I do not get the expected result; the graph displays wrong percentages:

library(tidyr)
dflong <- gather(Data, variable, freq)

ggplot(dflong, aes(x = factor(variable), fill = factor(freq))) +
  geom_bar(position = "fill") +
  geom_text(aes(label = scales::percent(..count../sum(..count..))), stat = "count", position = position_fill(vjust = 0.5))

[Image: stacked bar chart with wrong percentages]


Solution

  • You can pre-calculate the percentages within each variable before plotting, so that the labels in every bar add up to 100%.

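    library(dplyr)
    library(ggplot2)
    
    dflong %>%
      # count rows per variable/level; the count column is named 'percentage'
      # because it is overwritten with the percentage two steps later
      count(variable, freq, name = 'percentage') %>%
      group_by(variable) %>%
      # share of each level within its own variable, in percent
      mutate(percentage = prop.table(percentage) * 100) %>%
      ggplot() + aes(variable, percentage, fill = freq, 
                     label = paste0(round(percentage, 2), '%')) + 
      geom_col() + 
      geom_text(position = position_stack(vjust = 0.5))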
    

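As for why the reshaped plot went wrong: sum(..count..) is taken over every bar in the panel, not over one bar at a time, so with two variables each label is divided by the total of both bars. The sketch below first checks that arithmetic on the question's data, then shows one possible way to keep the calculation inside ggplot by dividing each count by the total of its own x position. It assumes a ggplot2 version that provides after_stat() (the current replacement for the ..count.. notation) and that the computed x column can be used as a grouping key in ave(); the names n_A, wrong_share, right_share and p are only illustrative.

```r
library(ggplot2)
library(tidyr)

# Same example data as in the question
Category <- rep(c("A", "B", "C", "D"), times = 4)
Country  <- rep(c("Country A", "Country B", "Country C",
                  "Country D", "Country F", "Country G"), times = 16)
Data     <- data.frame(Category, Country)
dflong   <- gather(Data, variable, freq)

# Number of rows behind the "A" segment of the Category bar
n_A <- sum(dflong$variable == "Category" & dflong$freq == "A")

# Denominator used by sum(..count..): all rows in the panel (both bars)
wrong_share <- n_A / nrow(dflong)

# Denominator that was intended: only the rows of the Category bar
right_share <- n_A / sum(dflong$variable == "Category")

# In-plot alternative: divide each count by the total of its own bar.
# ave(count, x, FUN = sum) returns, for every segment, the sum of the
# counts that share the same x position.
p <- ggplot(dflong, aes(x = variable, fill = freq)) +
  geom_bar(position = "fill") +
  geom_text(
    aes(label = after_stat(
      scales::percent(count / ave(count, x, FUN = sum), accuracy = 0.1)
    )),
    stat = "count",
    position = position_fill(vjust = 0.5)
  )
```

Here wrong_share is 0.125 and right_share is 0.25, which matches the halved labels in the second plot. Printing p should draw the same stacked bars as the pre-calculated version; which approach to prefer is mostly a matter of taste, although pre-computing makes the numbers easier to inspect before plotting.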