r ggplot2 plot median

ggplot2: Add medians in two different parts of the graph


I am trying to plot median values on a graph for a single case study. Unfortunately, I cannot get the result I want. Here's my code:

    library(tidyverse)

    Rate <- c(8, 4, 5, 5, 5, 4, 8, 7, 18, 15, 9, 8, 17, 19, 11, 11)
    Phase <- c("A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B")
    Occasions <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
    db4 <- data.frame(Occasions, Phase, Rate)

    medians <- db4 %>%
      group_by(Phase) %>%
      mutate(M = median(Rate))

    ggplot(data = db4, aes(x = Occasions, y = Rate)) +
      geom_point(aes(colour = Phase)) +
      geom_line(aes(group = Phase, color = Phase)) +
      geom_hline(yintercept = medians$M, colour = 'red') +
      geom_vline(xintercept = 8.5, linetype = "dashed", color = "blue", size = 0.5) +
      xlab("Occasions") +
      ylab("Ratings")

Below is the plot this code produces. However, both median lines are red and extend across the whole plot. I would like the first median line, in red, to appear only on the left side of the graph, and a second median line, in blue, only on the right side. I would also like to add titles for the two sides: "Baseline" and "Intervention".

Thanks for any help!


Solution

  • Here is how I would approach this plot. I wouldn't use facets because you want to keep everything on one axis, but I wouldn't use two titles either. Instead, just relabel your legend to make clear which points are Baseline and which are Intervention.

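    To keep the median lines from crossing the whole plot, use geom_segment() instead of geom_hline(), which always spans the full width of the panel. This means adding some columns to the medians data frame to supply the inputs geom_segment() needs (start and end coordinates for each segment), and using summarise() rather than mutate() so that there is one row per phase. I moved color = Phase into the top-level ggplot() call so that geom_segment() inherits it and draws each phase's median in that phase's colour. Finally, I made geom_vline() take xintercept from an aesthetic so its position isn't hardcoded. Note that median(Occasions) lands on the phase boundary (8.5) only because both phases here have the same number of occasions, so this may or may not be ideal in your case.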

    library(tidyverse)
    db4 <- data.frame(
      Rate = c(8, 4, 5, 5, 5, 4, 8, 7, 18, 15, 9, 8, 17, 19, 11, 11),
      Phase = c("A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B"),
      Occasions = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
    )
    
    medians <- db4 %>% 
      group_by(Phase) %>%
      summarise(
        y = median(Rate),
        x_start = min(Occasions),
        x_end = max(Occasions)
        )
    
    ggplot(data = db4, aes(x = Occasions, y = Rate, color = Phase)) +
      geom_point() +
      geom_line() +
      geom_segment(
        data = medians,
        mapping = aes(x = x_start, xend = x_end, y = y, yend = y)
        ) +
      geom_vline(
        mapping = aes(xintercept = median(Occasions)),
        linetype = "dashed",
        color = "blue",
        size = 0.5
        ) +
      labs(x = "Occasions", y = "Ratings") + 
      scale_colour_hue(labels = c("Baseline", "Intervention"))
    

    Created on 2018-08-13 by the reprex package (v0.2.0).
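
The answer above relabels the legend rather than adding titles or using the red and blue colours from the question. If you do want both, one possible variation is sketched below. It is an illustration only, not part of the original answer: the columns x_mid and label, the boundary variable, and the grey colour of the dashed line are assumptions introduced here. It also assumes summarise() returns the phases in the order A, B, and that phase B starts right after phase A ends.

```r
library(dplyr)
library(ggplot2)

db4 <- data.frame(
  Occasions = 1:16,
  Phase = rep(c("A", "B"), each = 8),
  Rate = c(8, 4, 5, 5, 5, 4, 8, 7, 18, 15, 9, 8, 17, 19, 11, 11)
)

# One row per phase: median rating, first and last occasion,
# the horizontal midpoint (hypothetical x_mid) and a title (hypothetical label)
medians <- db4 %>%
  group_by(Phase) %>%
  summarise(
    y = median(Rate),
    x_start = min(Occasions),
    x_end = max(Occasions)
  ) %>%
  mutate(
    x_mid = (x_start + x_end) / 2,
    label = c("Baseline", "Intervention")  # assumes rows are ordered A, B
  )

# Halfway between the last occasion of phase A and the first of phase B,
# so the divider does not depend on the phases having equal length
boundary <- (medians$x_end[1] + medians$x_start[2]) / 2

p <- ggplot(db4, aes(x = Occasions, y = Rate, colour = Phase)) +
  geom_point() +
  geom_line() +
  # Median segments inherit colour = Phase, so A is red and B is blue
  geom_segment(
    data = medians,
    mapping = aes(x = x_start, xend = x_end, y = y, yend = y)
  ) +
  # Grey divider (an assumption) so it is not confused with the blue phase
  geom_vline(xintercept = boundary, linetype = "dashed", colour = "grey40") +
  # Titles drawn inside the panel, just below its top edge
  annotate("text", x = medians$x_mid, y = Inf, label = medians$label, vjust = 1.5) +
  scale_colour_manual(
    values = c(A = "red", B = "blue"),
    labels = c("Baseline", "Intervention")
  ) +
  labs(x = "Occasions", y = "Ratings")

print(p)
```

Because the colour scale is shared, this also colours the points and lines red and blue. If those should keep the default hues, the median segments would need a constant colour set outside aes() instead.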