R and rect.hclust: rectangle on labels in dendrograms


I am building a dendrogram for the first time, and the rectangles around the clusters are drawn on top of the labels. Do you know how to modify the positioning of these labels in order to avoid this overlap?

Here is a working example of my code:

mydata <- c(9.45, 10.54, 10.36, 10.46, 10.78, 10.1, 11.13)
mydata.matrix <- matrix(mydata, nrow = 1, ncol = 7)
colnames(mydata.matrix) <- c("a", "b", "c", "d", "e", "f", "g")
rownames(mydata.matrix) <- c("X")

d <- dist(mydata.matrix["X", ], method = "euclidean")
fit <- hclust(d, method="ward.D")

nodePar <- list(lab.cex = 0.6, pch = c(NA, 19), cex = 0.7, col = "blue")
plot(as.dendrogram(fit), xlab = "", sub="", ylab = "Euclidean distance",
     main = "Dendrogram", nodePar = nodePar)

rect.hclust(fit, k=2, border="red")

And here is the plot from the code above:

[Image: worked example of dendrogram]

In particular, I would like the red rectangles to fully enclose the labels of the dendrogram's leaves.

Thank you!


Solution

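  • You should use the rect.dendrogram function from the dendextend package instead of rect.hclust. By default it extends each rectangle below the plot region, so the leaf labels end up inside it.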

    For example:

    mydata <- c(9.45, 10.54, 10.36, 10.46, 10.78, 10.1, 11.13)
    mydata.matrix <- matrix(mydata, nrow = 1, ncol = 7)
    colnames(mydata.matrix) <- c("a", "b", "c", "d", "e", "f", "g")
    rownames(mydata.matrix) <- c("X")
    
    d <- dist(mydata.matrix["X", ], method = "euclidean")
    fit <- hclust(d, method="ward.D")
    
    nodePar <- list(lab.cex = 0.6, pch = c(NA, 19), cex = 0.7, col = "blue")
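    # Keep the dendrogram object so the same tree is plotted and boxed
    dend <- as.dendrogram(fit)
    plot(dend, xlab = "", sub = "", ylab = "Euclidean distance",
         main = "Dendrogram", nodePar = nodePar)
    
    # rect.dendrogram() replaces rect.hclust() and takes the dendrogram itself
    library(dendextend)
    rect.dendrogram(dend, k = 2, border = "red")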
    

    And you will get:

    [Image: the resulting dendrogram]

    In general, for plotting dendrograms, you might find the quick introduction to dendextend useful (or look at the longer version).
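
If the default rectangle is too tall or too short for your labels, you can set its lower edge yourself. The following is a minimal sketch, assuming dendextend is installed and that its rect.dendrogram accepts the lower_rect and xpd arguments as documented in the package help; the named vector is just a shorter way to build the same data as in the question, and the blue rectangle is added only for comparison.

```r
# Sketch only: assumes the dendextend package is installed.
library(dendextend)

# Same values as in the question, stored as a named vector.
mydata <- c(a = 9.45, b = 10.54, c = 10.36, d = 10.46,
            e = 10.78, f = 10.1, g = 11.13)
fit  <- hclust(dist(mydata, method = "euclidean"), method = "ward.D")
dend <- as.dendrogram(fit)

plot(dend, ylab = "Euclidean distance", main = "Dendrogram")

# Default behaviour: the lower edge is placed below the labels.
# xpd = TRUE (the default) allows drawing outside the plot region.
rect.dendrogram(dend, k = 2, border = "red")

# Explicit lower edge: 0 stops the rectangle at the leaves,
# so the labels stay outside it (shown here only for comparison).
rect.dendrogram(dend, k = 2, border = "blue", lower_rect = 0)
```

A more negative lower_rect pushes the bottom edge further into the margin, which may help with long labels or a larger lab.cex.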