Selective labeling for ggplot lines


General Goal: Use ggplot to selectively label only lines whose last points are above a certain y value.

Potential Functions/Packages: I'm aware of the geom_text() function and directlabels package but I can't identify a way in their documentation to selectively label lines in the way I described above.

Sample Data

ID <- c(rep("ID1", 5), rep("ID2", 5), rep("ID3", 5), rep("ID4", 5), rep("ID5", 5))
Y <- c(1, 2, 3, 4, 5, 
       10, 20, 30, 40, 1, 
       5, 10, 15, 10, 60, 
       50, 30, 20, 25, 10,
       20, 25, 30, 35, 50)
Year <- c(rep(seq(2000 ,2004), 5))
DATA <- data.frame(ID, Year, Y)

Plot Data

library(ggplot2)

ggplot(data=DATA, aes(Year, Y)) + 
  geom_line(aes(color=ID)) + 
  theme_bw()

Plot

Problem

In the case of the above plot, is there a way to use geom_text(), directlabels, or any other function to automatically (rather than manually) label, by ID, only the lines whose last point has Y >= 50 (the purple and green lines)?

Thanks a lot for your help!


Solution

  • You can add the labels on the fly by filtering the data down to each line's last point and keeping only the points with Y >= 50; those rows give the label locations. For example:

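    library(dplyr)
    library(ggplot2)

    ggplot(data=DATA, aes(Year, Y, color=ID)) + 
      geom_line() + 
      # Label data: the last point of each line, kept only if Y >= 50
      geom_text(data=DATA %>% group_by(ID) %>% 
                  arrange(desc(Year)) %>% 
                  slice(1) %>% 
                  filter(Y >= 50),
                aes(x = Year + 0.03, label=ID), hjust=0) +
      theme_bw() +
      # "none" replaces the deprecated guides(colour=FALSE)
      guides(colour="none") +
      # Widen the x-axis slightly so the shifted labels have room
      expand_limits(x = max(DATA$Year) + 0.03)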
    

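If you would rather inspect the label rows before plotting, the same filtering can be done in a separate step. The following is only a sketch of that variation, not part of the original answer: the names `threshold` and `label_data` are made up for illustration, and `nudge_x` is used in place of `Year + 0.03` to shift the labels to the right.

```r
library(dplyr)
library(ggplot2)

# Sample data from the question, rebuilt so the block is self-contained
DATA <- data.frame(
  ID   = rep(paste0("ID", 1:5), each = 5),
  Year = rep(2000:2004, times = 5),
  Y    = c(1, 2, 3, 4, 5,
           10, 20, 30, 40, 1,
           5, 10, 15, 10, 60,
           50, 30, 20, 25, 10,
           20, 25, 30, 35, 50)
)

# Hypothetical names (not from the original answer)
threshold <- 50

label_data <- DATA %>%
  group_by(ID) %>%
  filter(Year == max(Year)) %>%  # keep each line's last point
  ungroup() %>%
  filter(Y >= threshold)         # keep only lines ending at or above the threshold

p <- ggplot(DATA, aes(Year, Y, color = ID)) +
  geom_line() +
  # Only the rows in label_data get a label; nudge_x moves it right of the line end
  geom_text(data = label_data, aes(label = ID), hjust = 0, nudge_x = 0.03) +
  theme_bw() +
  guides(colour = "none") +
  expand_limits(x = max(DATA$Year) + 0.03)

p
```

With the sample data, `label_data` holds the 2004 rows for ID3 (Y = 60) and ID5 (Y = 50), so only those two lines are labeled. Changing `threshold` changes which lines qualify without touching the plotting code.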