Tags: r, ggplot2, geom-bar, aesthetics, geom-point

Fix the order of variables in geom_point plotted on top of geom_col


I have a data table with two categorical variables and one numeric variable.

Here's code to generate the sample data:

data <- data.frame(system = rep(c("X","Y","Z"), 10), 
                   region = rep(letters[1:5], 6), 
                   value = rnorm(60, 500, 300)) 

Now I want to plot the mean of value for each system-region pair as columns, and overlay the overall mean for each system as points so the two can be compared.

Here is the code to build the data for plotting and the first plot:

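library(dplyr)
library(ggplot2)
# Mean per system-region pair, joined with the overall mean per system (avg_all)
plot_data <- data %>%
  mutate(system = factor(system), region = factor(region)) %>%
  group_by(system, region) %>%
  summarise(avg = mean(value), .groups = "drop") %>%
  left_join(y = data %>% group_by(system) %>% summarise(avg = mean(value), .groups = "drop"),
            by = "system", suffix = c("", "_all")) %>%
  # "above": the system mean lies above the bar; "in": it lies within the bar
  mutate(point_type = ifelse(avg_all > avg, "above", "in"))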

ggplot(plot_data, aes(x = region, y = avg, fill = system)) +
  geom_col(position = "dodge") +
  geom_point(aes(y = avg_all), shape = 21, position = position_dodge(width = 0.9))

[Figure: example plot]

But now, if I want to add a color aesthetic to geom_point, like this:

ggplot(plot_data, aes(x = region, y = avg, fill = system)) +
  geom_col(position = "dodge") +
  geom_point(aes(y = avg_all, color = point_type), shape = 21, position = position_dodge(width = 0.9))

The points are no longer dodged in the same order as the columns. In region 'b' the green and blue points and bars are misaligned, in region 'd' the red and green ones are, and in region 'e' all three (red, green, and blue) are. The misalignment does not follow an obvious pattern, and I cannot figure out why it happens. I also tried position = position_dodge2(reverse = TRUE), but that did not fix the problem.



Solution

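  • The issue is that adding the color aesthetic changes the grouping of the data used for geom_point. When group is not mapped explicitly, ggplot2 builds the groups from the interaction of all discrete variables mapped in the layer, so the points are now grouped by point_type as well as system, and position_dodge places them in that new group order. To fix this, map the group aesthetic explicitly to tell ggplot2 that the points should be grouped and dodged by system only.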

    library(ggplot2)
    
    ggplot(plot_data, aes(x = region, y = avg, fill = system)) +
      geom_col(position = "dodge") +
      geom_point(aes(y = avg_all, color = point_type, group = system),
        shape = 21, position = position_dodge(width = 0.9)
      )
    

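
To see the change in grouping directly, one option is to inspect the data that ggplot2 computes for the point layer with layer_data(). The sketch below is only an illustration and not part of the original answer: df is a small made-up data frame whose column names mirror plot_data, and its values are arbitrary.

```r
library(ggplot2)

# Hypothetical stand-in for plot_data: two regions, three systems, arbitrary values
df <- data.frame(
  region     = rep(c("a", "b"), each = 3),
  system     = rep(c("X", "Y", "Z"), times = 2),
  avg_all    = rep(c(450, 500, 550), times = 2),
  point_type = c("above", "in", "above", "in", "in", "above")
)

base  <- ggplot(df, aes(x = region, y = avg_all, fill = system))
dodge <- position_dodge(width = 0.9)

# Default grouping: derived from the discrete variables mapped in the layer
p_default <- base +
  geom_point(aes(color = point_type), shape = 21, position = dodge)

# Explicit grouping: points are grouped, and therefore dodged, by system only
p_grouped <- base +
  geom_point(aes(color = point_type, group = system), shape = 21, position = dodge)

groups_default <- layer_data(p_default)$group
groups_grouped <- layer_data(p_grouped)$group

# Compare how many distinct groups each version produces
length(unique(groups_default))
length(unique(groups_grouped))
```

With the explicit group mapping the layer should end up with one group per system, so within each region the points are dodged in system order, which is the order the columns follow. Without it, point_type contributes to the grouping and the dodge order can differ from region to region.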