Tags: r, visualization, hierarchical-clustering, dendrogram, dendextend

Color dendrogram branches based on external labels, up towards the root, until the labels no longer match


Following the question "Color branches of dendrogram using an existing column", I can color the branches nearest the leaves of the dendrogram. The code:

library(dendextend)

x <- 1:100
dim(x) <- c(10, 10)
set.seed(1)
groups <- c("red", "red", "red", "red", "blue", "blue", "blue", "blue", "red", "blue")
x.clust <- as.dendrogram(hclust(dist(x)))

x.clust.dend <- x.clust
labels_colors(x.clust.dend) <- groups
x.clust.dend <- assign_values_to_leaves_edgePar(x.clust.dend, value = groups, edgePar = "col") # add the colors
x.clust.dend <- assign_values_to_leaves_edgePar(x.clust.dend, value = 3, edgePar = "lwd") # make the lines thick
plot(x.clust.dend)

This generates a dendrogram in which only the edges leading to the leaves are colored. However, I want to color the branches up towards the root for as long as all the leaves in the current branch have the same label. If there is even a single mismatch, the branch should switch to the default color, black.

What I want is a little different from using color_branches, as in

x.clust.dend <- color_branches(x.clust.dend, k = 3)

because color_branches colors the branches according to the clusters it finds by cutting the tree, not according to external labels.


Solution

  • The function you are looking for is branches_attr_by_clusters. Here is how to use it:

    library(dendextend)
    
    x <- 1:100
    dim(x) <- c(10, 10)
    set.seed(1)
    groups <- c("red","red", "red", "red", "blue", "blue", "blue","blue", "red", "blue")
    dend <- as.dendrogram(hclust(dist(x)))
    
    clusters <- as.numeric(factor(groups, levels = c("red", "blue")))
    dend2 <-
      branches_attr_by_clusters(dend , clusters, values = groups)
    plot(dend2)
    


    This function was originally created to display the results of dynamicTreeCut. See the vignette for another example.
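To make the rule from the question explicit (color a branch only while every leaf below it carries the same label), here is a minimal base-R sketch of the idea. It is not how dendextend implements it: `color_pure_branches` is a hypothetical helper written for illustration. It also assumes that `groups` lists the colors in the dendrogram's left-to-right leaf order, which is how the question's code appears to treat it.

```r
# Same toy data as in the question; only base R (stats) is needed here.
x <- matrix(1:100, nrow = 10, ncol = 10)
groups <- c("red", "red", "red", "red", "blue", "blue", "blue", "blue", "red", "blue")
dend <- as.dendrogram(hclust(dist(x)))

# Assumption: groups[k] belongs to the k-th leaf from the left of the plot.
# Key the colors by each leaf's original row index so any subtree can look them up.
leaf_cols <- setNames(groups, order.dendrogram(dend))

# Hypothetical helper: walk the tree and set edgePar$col on every node whose
# leaves all share one color. A node's edgePar styles the edge to its parent.
color_pure_branches <- function(node, leaf_cols) {
  idx <- order.dendrogram(node)                        # row indices of leaves below
  cols <- unique(unname(leaf_cols[as.character(idx)]))
  if (length(cols) == 1) {
    ep <- attr(node, "edgePar")
    if (is.null(ep)) ep <- list()
    ep$col <- cols
    attr(node, "edgePar") <- ep
  }
  if (!is.leaf(node)) {
    for (i in seq_along(node)) {
      node[[i]] <- color_pure_branches(node[[i]], leaf_cols)
    }
  }
  node
}

colored <- color_pure_branches(dend, leaf_cols)
```

Calling `plot(colored)` should then draw mixed branches in the default black and single-label subtrees in their label color. If the labels are instead given in the original row order of `x`, the lookup vector would be `setNames(groups, seq_along(groups))`.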