
How to highlight points in a facet grid with ggplot?


I have the following data in long form:

data <- '"","n","variable","value"
"1",1,"adjr2",0.0365013693015789
"2",2,"adjr2",0.0514307495746085
"3",3,"adjr2",0.0547096973547058
"4",4,"adjr2",0.0552737311430782
"5",5,"adjr2",0.0552933455488706
"6",6,"adjr2",0.0552904097804204
"7",1,"cp",631.119186022639
"8",2,"cp",132.230096988504
"9",3,"cp",23.4429422708563
"10",4,"cp",5.55840294833615
"11",5,"cp",5.9017131979017
"12",6,"cp",7
"13",1,"bic",-1156.56144387716
"14",2,"bic",-1641.2046046544
"15",3,"bic",-1741.38235791823
"16",4,"bic",-1750.90145310605
"17",5,"bic",-1742.19643112204
"18",6,"bic",-1732.73634326858'

df <- read.csv(text=data)

I want to create a point plot for every variable. Currently, I'm doing this with ggplot2:

library(ggplot2)

ggplot(df) + geom_point(aes(x = n, y = value, fill = variable)) +
    facet_grid(variable ~ ., scales = "free_y")

The result is a faceted plot with one panel per variable.

I would now like to highlight one point in each subplot with a different colour. I cannot figure out how to add this to the current geom_point. Is it even possible?

For example, how would I highlight the maximum in the first subplot and the minimum in the other two?

I found a way to do it manually with three separate plots which are then joined in a grid, but that solution is 25 lines and there's a lot of repeated code. Is there a way to do it by just slightly modifying the above snippet?

(By the way, the minimum and maximum are found as which.min(df$value[df$variable == 'cp']), etc.)
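
As a minimal sketch of that lookup, the block below rebuilds the data frame from above so that it runs on its own, then collects one row per variable. The helper name `pick_row` and the result name `best` are hypothetical, and the rule (maximum for adjr2, minimum for everything else) is hard-coded as an assumption taken from the example in this question.

```r
# Rebuild the question's data so the block is self-contained.
df <- data.frame(
  n = rep(1:6, times = 3),
  variable = rep(c("adjr2", "cp", "bic"), each = 6),
  value = c(
    0.0365013693015789, 0.0514307495746085, 0.0547096973547058,
    0.0552737311430782, 0.0552933455488706, 0.0552904097804204,
    631.119186022639, 132.230096988504, 23.4429422708563,
    5.55840294833615, 5.9017131979017, 7,
    -1156.56144387716, -1641.2046046544, -1741.38235791823,
    -1750.90145310605, -1742.19643112204, -1732.73634326858
  )
)

# Hypothetical helper: return the row to highlight for one variable.
# Assumption: adjr2 is highlighted at its maximum, any other variable
# at its minimum.
pick_row <- function(d) {
  idx <- if (d$variable[1] == "adjr2") which.max(d$value) else which.min(d$value)
  d[idx, ]
}

# Apply the helper to each variable's rows and stack the results.
best <- do.call(rbind, lapply(split(df, df$variable), pick_row))
print(best)
```

A data frame like `best` keeps the `variable` column, so it could be passed as the `data` argument of a second `geom_point()` layer and would land in the matching facets.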


Solution

  • You could add a column that marks the maximum or minimum value in each facet. The code below marks the maximum in facets where a linear regression of value on n has a positive slope, and the minimum where the slope is negative. For this data, that selects the maximum for adjr2 and the minimum for cp and bic. The new column is then mapped to the colour aesthetic to set the point colours. (You can also make the highlighted points larger, or give them a different marker, by mapping the new column to the size and shape aesthetics, respectively.)

    library(dplyr)
    library(ggplot2)

    df <- df %>%
      group_by(variable) %>%                      # Group by the faceting variable
      mutate(highlight = coef(lm(value ~ n))[2],  # Slope of value vs. n within each facet
             highlight = ifelse(highlight > 0,    # Positive slope: mark max; otherwise mark min
                                ifelse(value == max(value), "Y", "N"),
                                ifelse(value == min(value), "Y", "N"))) %>%
      ungroup()                                   # Drop the grouping before plotting

    ggplot(df) +
      geom_point(aes(x = n, y = value, colour = highlight), size = 2, show.legend = FALSE) +
      facet_grid(variable ~ ., scales = "free_y") +
      scale_colour_manual(values = c(N = "black", Y = "red")) +  # Named so the mapping is explicit
      theme_bw()
    


    You can do this without permanently adding the new column to your data frame by piping the data frame directly to ggplot instead of saving the updated data frame first:

    df %>%
      group_by(variable) %>%
      mutate(highlight = coef(lm(value ~ n))[2],
             highlight = ifelse(highlight > 0,
                                ifelse(value == max(value), "Y", "N"),
                                ifelse(value == min(value), "Y", "N"))) %>%
      ggplot() +
        geom_point(aes(x = n, y = value, colour = highlight), size = 2, show.legend = FALSE) +
        facet_grid(variable ~ ., scales = "free_y") +
        scale_colour_manual(values = c(N = "black", Y = "red")) +
        theme_bw()
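
The size and shape mapping mentioned in the parenthetical above could look roughly like the sketch below. It rebuilds the question's data so that it runs on its own. The names `marked` and `slope`, the point sizes (2 and 4) and the shape codes (16 and 17) are illustrative assumptions, not part of the original answer.

```r
library(dplyr)
library(ggplot2)

# Rebuild the question's data so the block is self-contained.
df <- data.frame(
  n = rep(1:6, times = 3),
  variable = rep(c("adjr2", "cp", "bic"), each = 6),
  value = c(
    0.0365013693015789, 0.0514307495746085, 0.0547096973547058,
    0.0552737311430782, 0.0552933455488706, 0.0552904097804204,
    631.119186022639, 132.230096988504, 23.4429422708563,
    5.55840294833615, 5.9017131979017, 7,
    -1156.56144387716, -1641.2046046544, -1741.38235791823,
    -1750.90145310605, -1742.19643112204, -1732.73634326858
  )
)

# Same slope-based rule as above, with the slope kept in its own column.
marked <- df %>%
  group_by(variable) %>%
  mutate(slope = unname(coef(lm(value ~ n))[2]),
         highlight = ifelse(slope > 0,
                            ifelse(value == max(value), "Y", "N"),
                            ifelse(value == min(value), "Y", "N"))) %>%
  ungroup()

# Map the marker column to colour, size and shape at once.
# Sizes and shape codes are arbitrary choices for illustration.
p <- ggplot(marked, aes(x = n, y = value)) +
  geom_point(aes(colour = highlight, size = highlight, shape = highlight),
             show.legend = FALSE) +
  facet_grid(variable ~ ., scales = "free_y") +
  scale_colour_manual(values = c(N = "black", Y = "red")) +
  scale_size_manual(values = c(N = 2, Y = 4)) +
  scale_shape_manual(values = c(N = 16, Y = 17)) +
  theme_bw()

print(p)
```

Naming the entries in each `values` vector ties every style to "N" or "Y" explicitly, so the result does not depend on the order of the levels.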