Tags: r, ggplot2, dplyr, gghighlight

Highlight Only Flagged Values


I am trying to highlight only certain points on a combined line and point plot in ggplot2.

Here's a bit more background. In this data set, whenever a value goes outside a certain range, it is flagged as out of spec. The "in_spec" column holds 0 when a reading is outside the specified range and 1 when it is within it. Here's the data:

dat <- structure(list(Date = structure(c(1592784000, 1592784000, 1592784000, 
                                         1592784000, 1592870400, 1592870400, 1592870400, 1592870400, 1593388800, 
                                         1593388800, 1593388800, 1593388800, 1593475200, 1593475200, 1593475200, 
                                         1593475200, 1593561600, 1593561600, 1593561600, 1593561600, 1592956800, 
                                         1593043200, 1593129600, 1593648000, 1594166400, 1594684800, 1594771200, 
                                         1594857600, 1594944000, 1594252800, 1594339200), tzone = "UTC", class = 
                                         c("POSIXct", "POSIXt")),
                      variable = c("var1", "var1", "var1", "var1", "var1", "var1", "var1", 
                                   "var1", "var1", "var1", "var1", "var1", "var1", "var1", 
                                   "var1", "var1", "var1", "var1", "var1", "var1", "var1", 
                                   "var1", "var1", "var1", "var1", "var1", "var1", "var1", 
                                   "var1", "var1", "var1"),
                      reading = c(100.1, 100.1, 100.1, 100.1, 100.09, 100.09, 100.09, 100.09, 100.14, 
                                  100.14, 100.14, 100.14, 100.13, 100.13, 100.13, 100.13, 100.14, 
                                  100.14, 100.14, 100.14, 100.08, 100.05, 90.53, 100.14, 100.14, 
                                  90.3, 100.15, 100.14, 100.13, NA, NA),
                      in_spec = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
                                  1, 1, 0, 1, 1, 0, 1, 1, 1, NA, NA)), 
                 row.names = c(NA, -31L), class = c("tbl_df", "tbl", "data.frame"))

Plotting the trend is easy enough and with gghighlight, I've been able to highlight the values I'm after. Here's the code and output:

library(ggplot2)

# date along the x axis, reading as the y
p <- ggplot(dat, aes(x = Date, y = reading)) +
    # first plot the points
    geom_point() +
    # highlight points that are flagged with 0
    gghighlight::gghighlight(in_spec == 0) +
    # add the line connecting the points
    geom_line()
p

[Figure: Plot with Highlighted Points]

This is almost right, except that it's connecting the two points that are "out of spec", even though they aren't actually next to each other in time.

How can I highlight just the "out of spec" points, but leave the line connecting all the other points? The end goal would be the same plot, but with just the two highlighted points below, no line between them.

I've tried rearranging the order of the geom_line and geom_point calls and having the gghighlight call in different spots too.


Solution

  • I found a solution, but it does not involve gghighlight. Mapping color to in_spec when the points are added paints the in-spec and out-of-spec points in different colors. Note, however, that the column mapped to color must be discrete (for example a factor or a logical), not continuous, for scale_color_manual to work.

    dat <- structure(list(Date = structure(c(1592784000, 1592784000, 1592784000, 
                                             1592784000, 1592870400, 1592870400, 1592870400, 1592870400, 1593388800, 
                                             1593388800, 1593388800, 1593388800, 1593475200, 1593475200, 1593475200, 
                                             1593475200, 1593561600, 1593561600, 1593561600, 1593561600, 1592956800, 
                                             1593043200, 1593129600, 1593648000, 1594166400, 1594684800, 1594771200, 
                                             1594857600, 1594944000, 1594252800, 1594339200), tzone = "UTC", class = 
                                             c("POSIXct", "POSIXt")),
                          variable = c("var1", "var1", "var1", "var1", "var1", "var1", "var1", 
                                       "var1", "var1", "var1", "var1", "var1", "var1", "var1", 
                                       "var1", "var1", "var1", "var1", "var1", "var1", "var1", 
                                       "var1", "var1", "var1", "var1", "var1", "var1", "var1", 
                                       "var1", "var1", "var1"),
                          reading = c(100.1, 100.1, 100.1, 100.1, 100.09, 100.09, 100.09, 100.09, 100.14, 
                                      100.14, 100.14, 100.14, 100.13, 100.13, 100.13, 100.13, 100.14, 
                                      100.14, 100.14, 100.14, 100.08, 100.05, 90.53, 100.14, 100.14, 
                                      90.3, 100.15, 100.14, 100.13, NA, NA),
                          in_spec = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
                                      1, 1, 0, 1, 1, 0, 1, 1, 1, NA, NA)), 
                     row.names = c(NA, -31L), class = c("tbl_df", "tbl", "data.frame"))
    
    library(dplyr)
    library(ggplot2)

    # convert the in_spec column to a factor so scale_color_manual will work below
    dat <- dat %>%
      mutate(in_spec = as.factor(in_spec))
    
    p <- ggplot(dat, aes(x = Date, y = reading)) +
        # add a line connecting the readings, color it grey
        geom_line(color = "grey") +
        # add points for each of the readings, colored by in_spec value;
        # this results in two groups for the points, one group for in spec,
        # one group for out of spec.
        geom_point(aes(color = in_spec), size = 1) +
        # set the manual color scale so out of spec readings (level "0") are red
        # and in spec readings (level "1") are grey; guide = "none" hides the
        # legend (guide = FALSE is deprecated as of ggplot2 3.3.4)
        scale_color_manual(values = c("0" = "red", "1" = "grey"), guide = "none")
    p
    

    And the resulting chart:

    [Figure: Red Highlighted Plot]
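A possible variation on the answer above, offered only as a sketch: rather than converting in_spec to a factor and using a manual color scale, the flagged rows can be drawn as a separate point layer on top of the grey line and points. The data frame `toy` and the names `flagged` and `p_alt` are made up for illustration; `toy` reuses the question's column names, but its values are not the original readings.

```r
library(ggplot2)

# Hypothetical stand-in for the question's `dat`: same column names,
# illustrative values only.
toy <- data.frame(
  Date    = as.POSIXct("2020-06-22", tz = "UTC") + 86400 * 0:5,
  reading = c(100.10, 100.09, 90.53, 100.14, 90.30, 100.13),
  in_spec = c(1, 1, 0, 1, 0, 1)
)

# Keep only the rows flagged as out of spec.
# subset() also drops rows where in_spec is NA.
flagged <- subset(toy, in_spec == 0)

p_alt <- ggplot(toy, aes(x = Date, y = reading)) +
  # grey line through every reading
  geom_line(color = "grey") +
  # all readings as grey points
  geom_point(color = "grey", size = 1) +
  # flagged readings drawn again in red, on top of the grey points
  geom_point(data = flagged, color = "red", size = 1)
p_alt
```

Because the red points come from their own layer, no color scale or legend is involved, and in_spec can stay numeric. With the question's data, replacing `toy` with `dat` should be the only change needed.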