Tags: javascript, r, ggplot2, d3.js, dendrogram

How can I produce plots like this?


I came across this kind of plot, which shows the result of hierarchical clustering over a given set of time series. Can someone tell me how to draw such plots?

I am open to implementations in R or JavaScript, especially using d3.js.



Solution

  • You can always create the plot by hand: with base graphics, the fig parameter of par() lets you add plots inside another plot.

    # Sample data
    n <- 100
    k <- 6
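    d <- matrix(rnorm(k*n),ncol=k)
    d[,2] <- d[,1]  # To help check the results: A and B should be merged first
    colnames(d) <- LETTERS[1:k]
    x <- apply(d,2,cumsum)   # The time series to plot (random walks)
    r <- hclust(dist(t(d)))  # Clusters the increments d, not the cumulated series x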
    # Plot
    op <- par(mar=c(0,0,0,0),oma=c(0,2,0,0))
    plot(NA,ylim=c(.5,k+.5), xlim=c(0,4),axes=FALSE)
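    # Dendrogram, drawn by hand from r$merge. See ?hclust for details.
    xc <- yc <- rep(NA,k)  # Position and height of each merged cluster
    o <- 1:k
    o[r$order] <- 1:k      # o[j] is the row in which series j is drawn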
    for(i in 1:(k-1)) {
      a <- r$merge[i,1]
      x1 <- if( a<0 ) o[-a] else xc[a]
      y1 <- if( a<0 ) 0 else yc[a]
      b <- r$merge[i,2]
      x2 <- if( b<0 ) o[-b] else xc[b]
      y2 <- if( b<0 ) 0 else yc[b]
      lines( 
        3+c(y1,i,i,y2)/k,
        c(x1,x1,x2,x2),
        lwd=k-i
      )
      xc[i] <- (x1+x2)/2
      yc[i] <- i
    }
    # Time series
    axis(2,1:k,colnames(d)[r$order],las=1)
    u <- par("usr")
    for(i in 1:k) {
      # Row i in user coordinates, converted to the 0-1 coordinates expected by fig
      f <- c(0,3,i-.5,i+.5)
      f <- c( 
        (f[1]-u[1])/(u[2]-u[1]), 
        (f[2]-u[1])/(u[2]-u[1]), 
        (f[3]-u[3])/(u[4]-u[3]), 
        (f[4]-u[3])/(u[4]-u[3])
      )
      par(new=TRUE,fig=f)
      plot(x[,r$order[i]],axes=FALSE,xlab="",ylab="",main="",type="l",col="navy",lwd=2)
      box()
    }
    par(op)
    

    [Figure: dendrogram with time series]

    (After writing this, I realized that it is probably easier to do with layout()...)
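The layout() idea mentioned above could look roughly like the following. This is a minimal sketch, not the answer's own code: it assumes that plot.dendrogram() with horiz=TRUE places leaf j at y = j, counted from the bottom, and it uses an arbitrary seed, panel widths, and margins. The data are built the same way as in the code above.

```r
# Sample data, same construction as above
set.seed(1)                      # assumption: fixed seed, only for reproducibility
n <- 100
k <- 6
d <- matrix(rnorm(k * n), ncol = k)
d[, 2] <- d[, 1]                 # identical columns, so A and B merge first
colnames(d) <- LETTERS[1:k]
x <- apply(d, 2, cumsum)         # series to draw
r <- hclust(dist(t(d)))          # clustering on the increments, as above
dend <- as.dendrogram(r)
ord <- order.dendrogram(dend)    # leaf order, read from the dendrogram itself

# k stacked panels on the left (1..k, top to bottom), one tall panel on the right
op <- par(no.readonly = TRUE)
par(oma = c(1, 1, 1, 1))
layout(cbind(1:k, k + 1), widths = c(3, 1))   # widths are an arbitrary choice

# Series panels. Leaf j of the horizontal dendrogram sits at y = j (bottom up),
# so the rows are filled top-down with ord[k], ..., ord[1].
par(mar = c(0, 2, 0, 0))
for (i in k:1) {
  j <- ord[i]
  plot(x[, j], type = "l", col = "navy", lwd = 2,
       axes = FALSE, xlab = "", ylab = "")
  box()
  mtext(colnames(d)[j], side = 2, las = 1, line = 0.5)
}

# Dendrogram panel: no vertical padding (yaxs = "i") so that each leaf lines up
# with the middle of its row, and xlim flipped so the leaves touch the series.
par(mar = c(0, 0, 0, 0), yaxs = "i")
plot(dend, horiz = TRUE, leaflab = "none", axes = FALSE,
     xlim = c(0, attr(dend, "height")))
par(op)
```

Unlike the hand-drawn version, this one uses the real merge heights rather than the merge step for the horizontal position of each join. The top and bottom margins are zero in every panel, which is what keeps the rows and the leaves aligned.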