
ggplot2: geom_text() adds additional, unwanted values to plot


I have a seemingly simple problem that I can't solve: too many values appear on my plot. I only want to see the total count per category (tot_q, computed with n()) once, and only the relevant pc (the percentage of rows in each category where quality is 1). Here is my example code:

library(ggplot2)
library(dplyr)

category <- as.factor(c(1, 2, 3, 3, 2, 2, 1, 2, 4, 4, 1, 3, 2, 2, 2, 1))
quality <- as.factor(c(0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1))
mydata <- data.frame(category, quality)

mydata2<- mydata %>% group_by(category,quality) %>% mutate(count_q = n()) %>%
  group_by(category) %>% mutate(tot_q=n(),pc=count_q*100/tot_q)

myplot <- ggplot(mydata2, aes(x= category, y = pc)) +
  geom_bar(position = 'dodge', stat='identity', fill="lightblue") +
  geom_text(aes(label=round(pc)), position=position_dodge(0.9), vjust=-0.5) +
  geom_text(aes(label=round(tot_q)), nudge_y = 15, col="red")

myplot

Question: why do I get the tot_q value twice (the red numbers)? Furthermore, how might I hide the lower percentage (e.g. in category 1 I would only want to see 75%)? I imagine it has something to do with my pre-processing of the data but I can't figure out what to do differently.



Solution

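  • Use the subset of the data where quality == 1 for both geom_text() layers. Because mutate() keeps every original row, each category in mydata2 contains rows for quality 0 and for quality 1, each with its own pc. geom_text() draws one label per row at that row's pc, so tot_q is printed once for each distinct pc value and both percentages appear. Filtering the label data to quality == 1 leaves only the labels you want.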

    library(ggplot2)
    library(dplyr)
    category <- as.factor(c(1, 2, 3, 3, 2, 2, 1, 2, 4, 4, 1, 3, 2, 2, 2, 1))
    quality <- as.factor(c(0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1))
    mydata <- data.frame(category, quality)
    
    mydata2<- mydata %>% group_by(category,quality) %>% mutate(count_q = n()) %>%
      group_by(category) %>% mutate(tot_q=n(),pc=count_q*100/tot_q)
    
    myplot <- ggplot(mydata2, aes(x= category, y = pc)) +
      geom_bar(position = 'dodge', stat='identity', fill="lightblue") +
      geom_text(data = filter(mydata2, quality == 1),
        aes(label=round(pc)), position=position_dodge(0.9), vjust=-0.5) +
      geom_text(data = filter(mydata2, quality == 1),
        aes(label=round(tot_q)), nudge_y = 15, col="red")
    
    myplot
    

    Created on 2020-04-21 by the reprex package (v0.3.0)
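
The filtered layers still draw one label per original row, so identical labels are stacked on top of each other (three copies of 75 in category 1, for example), which can make the text look heavier than intended. A possible refinement, sketched here on the assumption that exactly one label per category is wanted, is to reduce the label data to one row per category with distinct() before plotting. The name label_data is introduced only for this illustration and is not part of the answer above.

```r
library(ggplot2)
library(dplyr)

category <- as.factor(c(1, 2, 3, 3, 2, 2, 1, 2, 4, 4, 1, 3, 2, 2, 2, 1))
quality <- as.factor(c(0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1))
mydata <- data.frame(category, quality)

# Same pre-processing as in the question: one row per observation
mydata2 <- mydata %>%
  group_by(category, quality) %>%
  mutate(count_q = n()) %>%
  group_by(category) %>%
  mutate(tot_q = n(), pc = count_q * 100 / tot_q)

# Hypothetical helper data: keep quality == 1 and collapse duplicate rows,
# leaving a single label row per category
label_data <- mydata2 %>%
  ungroup() %>%
  filter(quality == 1) %>%
  distinct(category, tot_q, pc) %>%
  arrange(category)

# Bars are drawn from mydata2 as before; labels come from label_data
myplot <- ggplot(mydata2, aes(x = category, y = pc)) +
  geom_bar(position = "dodge", stat = "identity", fill = "lightblue") +
  geom_text(data = label_data, aes(label = round(pc)), vjust = -0.5) +
  geom_text(data = label_data, aes(label = tot_q), nudge_y = 15, colour = "red")

myplot
```

The bars are unchanged from the answer above; only the label layers use the deduplicated data, so each percentage and each total is drawn exactly once.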