Tags: r, categorical-data, stacked-chart, stackedbarseries

How to create a percentage stack plot for unequal group size in R?


I have two groups, "A" and "B", of unequal sample size: "A" has 19 members and "B" has 15, for a total of 34. A categorical variable named "Drug1" indicates whether each person in groups A and B takes that drug, so its values are YES and NO.

How do I create a percentage stacked bar plot that shows both groups, with the percentage of people in each group who take the drug and who do not stacked on top of each other? I would also like to annotate each bar segment with its percentage value.

This is a sample of what the input looks like:

n <- 6  # number of rows in this small sample
dat1 <- data.frame(id = 1:n,
                   Group = sample(c("A", "B"), n, replace = TRUE),
                   Drug1 = sample(c("Yes", "No"), n, replace = TRUE))

Solution

  • I think this is what you want:

    library(ggplot2)
    
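    # Simulated example data: 30 random group/drug assignments
    dataset <- data.frame(
      Groups = as.factor(sample(c("A", "B"), 30, replace = TRUE)),
      Drug_1 = as.factor(sample(c("Yes", "No"), 30, replace = TRUE))
    )
    
    # Cross-tabulate, then convert counts to row percentages,
    # so each group sums to 100 regardless of its size
    df_tbl <- table(dataset)
    df_tbl <- round(100 * prop.table(df_tbl, margin = 1), 2)
    df_tbl <- as.data.frame(df_tbl)
    
    # Stacked bars, each segment labelled with its percentage
    ggplot(df_tbl, aes(fill = Drug_1, y = Freq, x = Groups)) +
      geom_bar(position = "stack", stat = "identity") +
      labs(y = "Percent") +
      geom_text(aes(label = Freq), position = "stack", vjust = 2)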
    
    

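For the group sizes given in the question (19 in A, 15 in B), the same idea might look like the sketch below. It is only an illustration under stated assumptions: the Yes/No values are simulated with a fixed seed, and the names `dat`, `pct`, and `p` are made up for this example. The labels are centred in each segment with `position_stack(vjust = 0.5)`, which is one common alternative to a fixed `vjust` offset.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the simulated answers are repeatable

# Simulated data with the group sizes from the question (19 in A, 15 in B).
# The Yes/No values are random placeholders, not real results.
dat <- data.frame(
  Group = rep(c("A", "B"), times = c(19, 15)),
  Drug1 = sample(c("Yes", "No"), 34, replace = TRUE)
)

# Row-wise percentages: each group's Yes/No shares sum to 100,
# so the unequal group sizes do not distort the comparison.
pct <- as.data.frame(100 * prop.table(table(dat$Group, dat$Drug1), margin = 1))
names(pct) <- c("Group", "Drug1", "Percent")

p <- ggplot(pct, aes(x = Group, y = Percent, fill = Drug1)) +
  geom_col() +
  # position_stack(vjust = 0.5) centres each label inside its segment
  geom_text(aes(label = sprintf("%.1f%%", Percent)),
            position = position_stack(vjust = 0.5)) +
  labs(y = "Percent")

print(p)
```

Because the percentages are computed within each group (`margin = 1`), both bars reach 100 even though the groups differ in size.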