Search code examples
rggplot2geom-bar

ggplot2 select categories for bar chart and create labels


I am trying make bar chart with ggplot2 with the dataset below. When I use the code

ggplot(p.data, aes(x = `Period Number`, y = `Total Jumps`)) +
  stat_summary(data = subset(p.data, Status = "Starter"), fun ="mean", geom = "bar")

I get this graph:

plot

The most concerning aspect is the for period 2, 3, 4, and 5 the bars should be taller (period 2 should be around 9.9). Additionally, I would like to remove period 0 and period 1 and add bar labels with the raw data and without creating an additional data frame.

p.data <-   structure(list(`Period Number` = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), 
    `Total Jumps` = c(112L, 97L, 28L, 132L, 162L, 19L, 92L, 112L, 
    97L, 141L, 68L, 86L, 76L, 26L, 105L, 125L, 19L, 92L, 112L, 
    64L, 101L, 68L, 4L, 8L, 0L, 8L, 12L, 0L, 0L, 0L, 13L, 8L, 
    0L, 8L, 2L, 2L, 5L, 12L, 0L, 0L, 0L, 5L, 11L, 0L, 0L, 6L, 
    0L, 9L, 8L, 0L, 0L, 0L, 7L, 10L, 0L, 14L, 5L, 0L, 5L, 5L, 
    0L, 0L, 0L, 8L, 11L, 0L, 108L, 131L, 47L, 136L, 159L, 35L, 
    114L, 116L, 111L, 190L, 64L, 75L, 95L, 47L, 116L, 123L, 27L, 
    103L, 108L, 70L, 152L, 64L, 4L, 7L, 0L, 14L, 10L, 0L, 0L, 
    0L, 15L, 10L, 0L, 4L, 0L, 0L, 3L, 7L, 7L, 8L, 8L, 5L, 10L, 
    0L, 7L, 14L, 0L, 3L, 10L, 1L, 0L, 0L, 11L, 7L, 0L, 18L, 15L, 
    0L, 0L, 9L, 0L, 3L, 0L, 10L, 11L, 0L, 118L, 96L, 48L, 143L, 
    170L, 37L, 118L, 117L, 116L, 165L, 56L, 80L, 68L, 48L, 114L, 
    130L, 36L, 114L, 107L, 80L, 123L, 56L, 2L, 10L, 0L, 8L, 11L, 
    0L, 0L, 0L, 5L, 9L, 0L, 4L, 12L, 0L, 6L, 5L, 0L, 4L, 8L, 
    12L, 8L, 0L, 7L, 4L, 0L, 10L, 10L, 0L, 0L, 0L, 12L, 13L, 
    0L, 25L, 2L, 0L, 5L, 14L, 1L, 0L, 2L, 7L, 12L, 0L), Status = structure(c(1L, 
    2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 
    1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 
    2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 
    1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 
    1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 
    1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 
    2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 
    1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 
    1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 
    2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 
    2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 
    2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 
    1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 
    2L, 1L), .Label = c("Bench", "Starter"), class = "factor")), row.names = c(NA, 
198L), class = "data.frame")

Thank you for your help!


Solution

  • It's best to pass that data you actually want to plot to the plotting function, rather than trying to coerce it within the plotting function. In this case you were trying to subset a different data frame from the one you passed to ggplot inside stat_summary. The call to ggplot had already set up the aesthetics you wanted mapped, then in your only geom layer, you were telling ggplot you wanted a completely different set of aesthetics.

    You don't need to create another data frame to reshape your data. Here's how you could do it using dplyr:

    library(dplyr)
    library(ggplot2)
    
    p.data %>% 
      filter(Status == "Starter") %>% 
      group_by(`Period Number`) %>%
      summarise(`Total Jumps` = mean(`Total Jumps`)) %>%
      filter(`Period Number` > 1) %>%
      ggplot(aes(x = `Period Number`, y = `Total Jumps`)) + 
      geom_col(fill = "dodgerblue", colour = "black") +
      geom_text(aes(y = `Total Jumps` + 1, label = signif(`Total Jumps`, 2)))
    

    enter image description here