
R ggplot colour labelling time series based on class


I have two time series as below:

library(MASS) # provides mvrnorm

y1 <- mvrnorm(50, c(3,1), matrix(c(0.5,0.3,0.3,0.3),2,2)) # 50 draws from a bivariate normal
y2 <- mvrnorm(50, c(1,0), matrix(c(2,.1,.1,1),2,2)) # 50 draws from another bivariate normal
y <- rbind(y1,y2) # append the second to the end of the first

I plot these with ggplot:

library(ggplot2)

yd <- as.data.frame(y) # columns are named V1 and V2
g <- ggplot(data=yd, aes(x=seq_along(V1))) + # x is the time index
    geom_line(aes(y=V1, colour="TS1")) +
    geom_line(aes(y=V2, colour="TS2")) +
    scale_colour_manual(name="Levels",
                        values=c("TS1"="black",
                                 "TS2"="blue")) +
    labs(title="Two time series") +
    xlab("Time") +
    ylab("Levels") +
    theme(legend.justification = c(1, 0), legend.position = c(1, 0))

Figure 1

I then run a classifier which creates a numeric vector of class labels for each time point. Below I plot the posteriors and provide the label vector.

Figure 2

dput(labels)
c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L)

I want to colour-code Figure 1 according to the class labels in the labels vector above. To be clear, I want to see which state (class) I am in at any given time, rather than just the boundaries where the state shifts. I think the most intuitive way to do that is to change the background colour (e.g. from gray to orange) when the state shifts to class 2.

What is the best way of achieving this in ggplot? I am also open to other approaches.


Solution

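  • You can get the effect of a background colour with geom_ribbon: build a small data frame holding the x positions where the label changes, then draw it as a semi-transparent ribbon behind the lines.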

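    # Background data: x holds the start of the series, each change point (twice,
    # as the end of one band and the start of the next), and the end of the series.
    # Note: rep(unique(labels), each=2) assumes each class forms one contiguous run.
    df_bg <- data.frame(x = c(0, rep(which(as.logical(diff(labels))), each=2), length(labels)), 
                        ymin = 1.1*min(yd$V1, yd$V2), 
                        ymax = 1.1*max(yd$V1, yd$V2), 
                        fill = factor(rep(unique(labels), each=2)))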
    # plot 
    g <- ggplot(data=yd, aes(x = seq_along(V1))) +
      geom_ribbon(data = df_bg, 
                  aes(x = x, ymin=ymin, ymax=ymax, fill=fill), alpha=.2) +
      geom_line(aes(y=V1, color="TS1")) +
      geom_line(aes(y=V2, color="TS2")) +
      scale_colour_manual(name= "Levels",
                          values = c("TS1"= "black",
                                     "TS2" ="blue"))+
      labs(title="Two time series") +
      xlab("Time") +
      ylab("Levels") +
      theme(legend.justification = c(1, 0), legend.position = c(1, 0))
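
If a class can recur (for example 1, 2, then 1 again), the data frame above no longer lines up, because it pairs change points with unique(labels). A possible alternative, sketched here under assumptions and not part of the original answer, is to compute one rectangle per run of identical labels with rle() and draw the runs with geom_rect. The helper name label_runs, the demo vector labels_demo and the demo data frame are hypothetical and only serve as illustration; the plotting part is skipped if ggplot2 is not installed.

```r
# Hypothetical helper (not from the original answer): one row per run of
# identical consecutive labels, with the time span that run covers.
label_runs <- function(labels) {
  r <- rle(labels)
  ends <- cumsum(r$lengths)        # last index of each run
  starts <- ends - r$lengths + 1   # first index of each run
  data.frame(
    xmin  = starts - 0.5,          # extend half a step so adjacent bands touch
    xmax  = ends + 0.5,
    class = factor(r$values)
  )
}

# Illustrative labels in which class 1 recurs
labels_demo <- c(rep(1L, 3), rep(2L, 4), rep(1L, 2))
runs <- label_runs(labels_demo)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  set.seed(1)
  # Made-up series, only so the sketch runs on its own
  demo <- data.frame(t = seq_along(labels_demo),
                     V1 = rnorm(length(labels_demo)))
  p <- ggplot(demo) +
    # -Inf/Inf make each band span the full height of the panel
    geom_rect(data = runs,
              aes(xmin = xmin, xmax = xmax, ymin = -Inf, ymax = Inf,
                  fill = class),
              alpha = 0.2) +
    geom_line(aes(x = t, y = V1)) +
    labs(x = "Time", y = "Levels", fill = "Class")
  # print(p) draws the plot
}
```

With the data from the question, label_runs(labels) would replace df_bg, and the geom_rect layer would replace the geom_ribbon layer. Using -Inf and Inf for the vertical limits also avoids having to pad the data range by hand.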