
Bring highlighted points to the front in ggplot


I am trying to draw a scatterplot of two variables from a large time-series dataset in R, and I would like to highlight the data from one month and bring those points to the front. I have tried some solutions suggested in earlier forum posts, but they do not seem to work (maybe because the questions are a bit old and some arguments have changed in newer versions). So far I have this:

library(ggplot2)
library(lubridate)  # for month()

set.seed(123)

date <- seq(as.POSIXct("2022-04-01 00:00:00"), as.POSIXct("2022-10-31 23:00:00"), by = "hour")
t <- abs(rnorm(length(date)))
y <- exp(t) + rnorm(length(date), mean = 0, sd = 3)

df <- data.frame(date = date, t = t, y = y)

df$month <- month(df$date)

highlight_month <- 1 
non_highlighted_colors <- rep("grey", length(unique(df$month)))
non_highlighted_colors[highlight_month] <- "red"
df$order<-ifelse(df$month==highlight_month,1,2)

ggplot(df, aes(t, y)) +
  geom_point(aes(color = factor(month),order=order)) +
  scale_color_manual(values = non_highlighted_colors) +
  labs(color = "Month") +
  theme_minimal()

The first thing I get is that order has been ignored. I think the cause may be that highlight_month is 1 in the code, which corresponds to month 4 in the data frame (the first month present), so when I build order it looks for January, which is not in the data.

Is this the reason the code is not working?

Thank you for any suggestions.


Solution
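
A quick check of the hypothesis in the question, as a minimal sketch: it rebuilds only the date and month columns of the example data (base R's format() stands in for lubridate's month() here, which is a substitution made to avoid the dependency) and shows which months are present and how the order column ends up being filled. Separately, the "ignored" message suggests that geom_point() does not act on an order aesthetic in the installed ggplot2 version, so correcting the month number alone would presumably not change the drawing order.

```r
# Rebuild the date column from the question.
date <- seq(as.POSIXct("2022-04-01 00:00:00"),
            as.POSIXct("2022-10-31 23:00:00"), by = "hour")
df <- data.frame(date = date)

# Base-R stand-in for lubridate::month() (an assumption, see the note above).
df$month <- as.integer(format(df$date, "%m"))

# The calendar months that actually occur in the data.
months_present <- sort(unique(df$month))
print(months_present)

# Same construction as in the question: month 1 is compared against the data.
highlight_month <- 1
df$order <- ifelse(df$month == highlight_month, 1, 2)
print(table(df$order))  # number of rows in each order group
```

Since the data run from April to October, comparing against 1 matches no rows, so the highlighted month has to be referred to by its calendar number (4), as the code below does.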

  • You can make things simpler by mapping the colour to a condition and specifying a manual colour scale:

    ggplot(df, aes(t, y)) +
    geom_point(aes(colour = month == 4)) +
    scale_colour_manual(values = c("grey", "red")) +
    labs(colour = "Month") +
    theme_minimal()
    


    But you probably want to bring the highlighted points to the front, so you'll need to split the plotting into two geom_point layers to make sure that the highlighted points get drawn after (i.e. on top of) the grey ones:

    ggplot(df, aes(t, y)) +
      geom_point(data = df[df$month != 4, ], aes(colour = month == 4)) +
      geom_point(data = df[df$month == 4, ], aes(colour = month == 4)) +
      scale_colour_manual(values = c("grey", "red")) +
      labs(colour = "Month") +
      theme_minimal()
    

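
    As an alternative to two layers, here is a minimal sketch that keeps a single geom_point and instead sorts the rows so that the highlighted month comes last. It relies on points within a layer being drawn in row order. The names df_sorted and is_april are illustrative, and the example data are rebuilt here with base R's format() in place of lubridate's month() so the snippet runs on its own:

    ```r
    library(ggplot2)

    # Example data as in the question.
    set.seed(123)
    date <- seq(as.POSIXct("2022-04-01 00:00:00"),
                as.POSIXct("2022-10-31 23:00:00"), by = "hour")
    t <- abs(rnorm(length(date)))
    y <- exp(t) + rnorm(length(date), mean = 0, sd = 3)
    df <- data.frame(date = date, t = t, y = y)
    df$month <- as.integer(format(df$date, "%m"))  # stand-in for month()

    # Flag the month to highlight, then sort so that flagged rows come last:
    # order() on a logical vector puts FALSE before TRUE.
    df$is_april <- df$month == 4
    df_sorted <- df[order(df$is_april), ]

    # Single layer; named values tie each colour to its logical value.
    p <- ggplot(df_sorted, aes(t, y)) +
      geom_point(aes(colour = is_april)) +
      scale_colour_manual(values = c("FALSE" = "grey", "TRUE" = "red")) +
      labs(colour = "April") +
      theme_minimal()
    print(p)
    ```

    The two-layer version below makes the drawing order explicit in the code, whereas this one hides it in the row order of the data, so which to prefer is largely a matter of taste.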

    You probably want a nicer legend, so you can construct a factor variable from the highlighting condition and map that to colour:

    df$highlight <- factor(df$month == 4,
                           levels = c(TRUE, FALSE),
                           labels = c("April", "Other"))
    
    ggplot(df, aes(t, y)) +
      geom_point(data = df[df$highlight == "Other", ], aes(colour = highlight)) +
      geom_point(data = df[df$highlight == "April", ], aes(colour = highlight)) +
      scale_colour_manual(values = c("grey", "red")) +
      labs(colour = "Month") +
      theme_minimal()
    


    But because the legend follows the plotting order, Other comes first in the legend, which looks odd. You can correct this by specifying both the breaks and the values of the colour scale:

    ggplot(df, aes(t, y)) +
      geom_point(data = df[df$highlight == "Other", ], aes(colour = highlight)) +
      geom_point(data = df[df$highlight == "April", ], aes(colour = highlight)) +
      scale_colour_manual(breaks = c("April", "Other"),
                          values = c("red", "grey")) +
      labs(colour = "Month") +
      theme_minimal()
    

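
    If you would rather not depend on the order in which colours are matched to levels, a named values vector ties each colour to its label explicitly. The following is a sketch under the same assumptions as the example data (rebuilt here so that it runs on its own, with format() standing in for month()); pal is an illustrative name:

    ```r
    library(ggplot2)

    # Example data as in the question.
    set.seed(123)
    date <- seq(as.POSIXct("2022-04-01 00:00:00"),
                as.POSIXct("2022-10-31 23:00:00"), by = "hour")
    t <- abs(rnorm(length(date)))
    y <- exp(t) + rnorm(length(date), mean = 0, sd = 3)
    df <- data.frame(date = date, t = t, y = y)
    df$month <- as.integer(format(df$date, "%m"))  # stand-in for month()

    df$highlight <- factor(df$month == 4,
                           levels = c(TRUE, FALSE),
                           labels = c("April", "Other"))

    # Named palette: colours are matched to factor labels by name, not position.
    pal <- c(April = "red", Other = "grey")

    p <- ggplot(df, aes(t, y)) +
      # Grey points first, highlighted points second (drawn on top).
      geom_point(data = df[df$highlight == "Other", ], aes(colour = highlight)) +
      geom_point(data = df[df$highlight == "April", ], aes(colour = highlight)) +
      scale_colour_manual(breaks = names(pal), values = pal) +
      labs(colour = "Month") +
      theme_minimal()
    print(p)
    ```

    With this form, changing the legend order only requires reordering breaks; the colour assigned to each label stays the same.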