
R: connect points on a graph (ggplot2)


Suppose I have data in the following form:

library(ggplot2)

Data <- data.frame(
    "ID" = c("ABC111", "ABC111", "ABC111", "ABC111", "ABC112", "ABC112", "ABC112", "ABC113", "ABC113", "ABC114", "ABC115"),
    "color" = c("red", "red", "red", "red", "blue", "blue", "blue", "green", "green", "black", "yellow"),
    "start_date" = c("2005/01/01", "2006/01/01", "2007/01/01", "2008/01/01", "2009/01/01", "2010/01/01", "2011/01/01", "2012/01/01", "2013/01/01", "2014/01/01", "2015/01/01"),
    "end_date" = c("2005/09/01", "2006/06/01", "2007/04/01", "2008/05/07", "2009/06/01", "2010/10/01", "2011/12/12", "2013/05/01", "2013/06/08", "2015/01/01", "2016/08/09")
)

Data$ID = as.factor(Data$ID)
Data$color = as.factor(Data$color)

For each row, I want to plot the start_date and the end_date as two points and then connect them with a straight line. I believe this can be done with geom_line() in ggplot2.

I want something that looks like this: (image not included)

I tried using the following code:

q <- qplot(start_date, end_date, data=Data)
q <- q + geom_line(aes(group = ID))
q

But the graph looks completely different from what I expected.

Can anyone please show me what I am doing wrong?

Thanks


Solution

  • Here's a solution using the tidyverse. I used each row's number in the original data as its y-axis value, so every row is drawn on its own horizontal line. Since these values carry no meaning, I removed the y-axis title, labels and ticks from the plot.

    library(tidyverse)
    
    Data %>%
      # Number each row in its order of appearance and
      # save these numbers in a new column named order
      rowid_to_column("order") %>%
      # Change data from wide to long format
      pivot_longer(cols = c(start_date, end_date),
                   names_to = "date_type",
                   values_to = "date") %>%
      # Ggplot, use date as x, order as y, ID as col and order as group
      ggplot(aes(x = date, 
                 y = order,  
                 col = ID, 
                 group = order)) +
      # Draw points
      geom_point() +
      # Draw lines
      geom_line() +
      # Maybe you want to remove the y axis title, text and ticks
      theme(axis.title.y = element_blank(),
            axis.text.y = element_blank(),
            axis.ticks.y = element_blank(),
            # I rotated the x axis labels to vertical;
            # they might be easier to read this way
            axis.text.x = element_text(angle = 90, vjust = 0.5))
    

    [Image: points-and-lines plot]
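As for what went wrong in the question's attempt: qplot(start_date, end_date) puts start_date on x and end_date on y, so each row becomes a single point, and geom_line(aes(group = ID)) then joins the points of different rows that share an ID. If you would rather not reshape the data, a rough alternative is geom_segment(), which draws one line per row straight from the wide format. This is only a sketch and not part of the answer above: the as.Date() conversion and the base-R order column are my own additions, and I dropped the color column because it is not needed for the plot.

```r
library(ggplot2)

# Same data as in the question, minus the unused color column
Data <- data.frame(
  ID = c("ABC111", "ABC111", "ABC111", "ABC111", "ABC112", "ABC112",
         "ABC112", "ABC113", "ABC113", "ABC114", "ABC115"),
  start_date = c("2005/01/01", "2006/01/01", "2007/01/01", "2008/01/01",
                 "2009/01/01", "2010/01/01", "2011/01/01", "2012/01/01",
                 "2013/01/01", "2014/01/01", "2015/01/01"),
  end_date = c("2005/09/01", "2006/06/01", "2007/04/01", "2008/05/07",
               "2009/06/01", "2010/10/01", "2011/12/12", "2013/05/01",
               "2013/06/08", "2015/01/01", "2016/08/09")
)

# Assumption: parse the "YYYY/MM/DD" strings as dates so that the x axis
# is a continuous time scale instead of a set of discrete labels
Data$start_date <- as.Date(Data$start_date, format = "%Y/%m/%d")
Data$end_date   <- as.Date(Data$end_date,   format = "%Y/%m/%d")

# One y position per row, in order of appearance (same idea as the
# order column in the answer, built here with base R)
Data$order <- seq_len(nrow(Data))

p <- ggplot(Data, aes(y = order, colour = ID)) +
  # One horizontal segment per row, from start_date to end_date
  geom_segment(aes(x = start_date, xend = end_date, yend = order)) +
  # Mark both endpoints of each segment
  geom_point(aes(x = start_date)) +
  geom_point(aes(x = end_date)) +
  labs(x = "date") +
  # The y values are only row numbers, so hide the y axis
  theme(axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

p
```

With real dates on the x axis, the length of each segment is proportional to the time between start_date and end_date, which is not the case when the dates stay as character strings.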