r cluster-analysis dendrogram dendextend

How do I add string variables to a dendrogram with labels coloured by factor level?


One of the answers to this question colour-codes the labels of a dendrogram for a subset of the iris dataset. I would like to keep the species names as the labels, so that they read setosa, virginica, etc., along with their colours.

Here's the code:

# install.packages("dendextend")
library(dendextend)

small_iris <- iris[c(1, 51, 101, 2, 52, 102), ]
dend <- as.dendrogram(hclust(dist(small_iris[,-5])))
# Like: 
# dend <- small_iris[,-5] %>% dist %>% hclust %>% as.dendrogram

# By default, the dend has no colors to the labels
labels_colors(dend)
par(mfrow = c(1,2))
plot(dend, main = "Original dend")

# let's add some color:
colors_to_use <- as.numeric(small_iris[,5])
colors_to_use
# But sort them based on their order in dend:
colors_to_use <- colors_to_use[order.dendrogram(dend)]
colors_to_use
# Now we can use them
labels_colors(dend) <- colors_to_use
# Now each species has a color
labels_colors(dend) 
plot(dend, main = "A color for every Species")

Solution

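  • You need to update the labels just before plotting, for example with labels(dend) <- small_iris[,5][order.dendrogram(dend)]. Indexing by order.dendrogram(dend) puts the species names in the same left-to-right order as the leaves of the dendrogram.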

    Full code and output:

    # install.packages("dendextend")
    library(dendextend)
    
    small_iris <- iris[c(1, 51, 101, 2, 52, 102), ]
    dend <- as.dendrogram(hclust(dist(small_iris[,-5])))
    # Like: 
    # dend <- small_iris[,-5] %>% dist %>% hclust %>% as.dendrogram
    
    # By default, the dend has no colors to the labels
    labels_colors(dend)
    par(mfrow = c(1,2))
    plot(dend, main = "Original dend")
    
    # let's add some color:
    colors_to_use <- as.numeric(small_iris[,5])
    colors_to_use
    # But sort them based on their order in dend:
    colors_to_use <- colors_to_use[order.dendrogram(dend)]
    colors_to_use
    # Now we can use them
    labels_colors(dend) <- colors_to_use
    # Now each species has a color
    labels_colors(dend) 
    
    ### UPDATE <--------------------------------
    labels(dend) <- small_iris[,5][order.dendrogram(dend)]
    
    
    plot(dend, main = "A color for every Species")
    

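    As a variation on the code above (a minimal sketch, not part of the original answer), the species names can also be attached to the hclust object before it is converted to a dendrogram. This relies on the assumption that hclust keeps its labels in the original row order, so only the colours need to be reordered to the leaf order. The variable names species, hc and leaf_order are introduced here for illustration, and the factor is converted with as.character() so the labels are plain strings.

    ```r
    library(dendextend)

    small_iris <- iris[c(1, 51, 101, 2, 52, 102), ]
    species <- small_iris$Species

    hc <- hclust(dist(small_iris[, -5]))
    # hclust stores labels in the original row order, so no reordering is needed here
    hc$labels <- as.character(species)
    dend <- as.dendrogram(hc)

    # Colours, in contrast, must follow the left-to-right leaf order of the dendrogram
    leaf_order <- order.dendrogram(dend)
    labels_colors(dend) <- as.numeric(species)[leaf_order]

    # Sanity check: each leaf label matches the species of the row it represents
    stopifnot(all(labels(dend) == as.character(species)[leaf_order]))

    plot(dend, main = "A color for every Species")
    ```

    Either route should give the same plot. Setting labels(dend) afterwards, as in the answer, is handy when the dendrogram already exists, while setting hc$labels avoids one reordering step.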