Tags: r, dendextend, ggdendro

Preserving label and legend color across 2 dendrograms


Objective: I would like to preserve label color and legend color across 2 dendrograms created from the same dataset.

I have one dataset (40 observations) that is turned into a dendrogram in 2 ways (pre-filtered and filtered). However, the label colors change depending on how the data are clustered, because clustering changes the order of the labels in the dendrogram.

Here is a code snippet:

library(dendextend)
library(dplyr)  # needed for select()
small_mtcars <- head(mtcars)

small_mtcars

d1 = small_mtcars %>% select(mpg, cyl, disp) %>% dist() %>% hclust(method = "average") %>% as.dendrogram() 
d2 = small_mtcars %>% select(mpg, cyl, disp) %>% dist() %>% hclust(method = "complete") %>% as.dendrogram() 


par(mar = c(10,4,4,2) + 0.1)

# Plotting d1 

test <- d1 %>% 
  set("labels_cex",0.7) %>% 
  plot(main="d1")
legend("topright", legend=unique(rownames(small_mtcars)[order.dendrogram(d1)]), cex=0.75, bty="n",
       fill=seq(1,length(unique(rownames(small_mtcars)[order.dendrogram(d1)]))))

# Plotting d2 

test2 <- d2 %>% 
  set("labels_cex",0.7) %>% 
  plot(main="d2")
legend("topright", legend=unique(rownames(small_mtcars)[order.dendrogram(d2)]), cex=0.75, bty="n",
       fill=seq(1,length(unique(rownames(small_mtcars)[order.dendrogram(d2)]))))

Based on the code snippet above, here are the 2 things I want to achieve:

  1. The color legend should be the same for both dendrograms (in the attached images, the Valiant model is green in the d1 dendrogram but violet in the d2 dendrogram).
  2. Each leaf label should be colored with the same color as its legend entry.

Thanks in advance.


Solution

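  • Several things in your code needed reworking, so I've rewritten it below. The key change is to pick one color per observation up front and then reorder that color vector with order.dendrogram() for each tree. That way every car keeps the same color on its leaf label and in the legend, however it is clustered. If you have follow-up questions, you can post them as a comment :)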

    library(dendextend)
    library(dplyr)
    small_mtcars <- head(mtcars) %>% select(mpg, cyl, disp)
    
    small_mtcars
    
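    d1 <- small_mtcars %>% dist() %>% hclust(method = "average") %>% as.dendrogram()
    d2 <- small_mtcars %>% dist() %>% hclust(method = "complete") %>% as.dendrogram()
    
    library(colorspace)
    # One fixed color per observation, in the row order of small_mtcars
    some_colors <- rainbow_hcl(nrow(small_mtcars))
    
    # order.dendrogram() returns each leaf's original row index (left to right),
    # so indexing by it puts every observation's own color on its label
    d1_col <- some_colors[order.dendrogram(d1)]
    d2_col <- some_colors[order.dendrogram(d2)]
    
    labels_colors(d1) <- d1_col
    labels_colors(d2) <- d2_col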
    
    par(mfrow = c(1,2))
    
    # Plotting d1 
    
    the_labels <- rownames(small_mtcars)
    
    d1 %>% 
        set("labels_cex",0.7) %>% 
        plot(main="d1", xlim = c(1,9))
    legend("topright", legend=the_labels, cex=0.75, bty="n",
           fill=some_colors)
    
    # Plotting d2 
    
    d2 %>% 
        set("labels_cex",0.7) %>% 
        plot(main="d2", xlim = c(1,9))
    legend("topright", legend=the_labels, cex=0.75, bty="n",
           fill=some_colors)
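
The original legend code paired the labels in dendrogram order with the colors 1, 2, 3, ..., so a color belonged to a leaf position rather than to a car. The following is a minimal sketch of an equivalent way to pin colors to observations, assuming you would rather look colors up by label name than by row index. The name `label_colors` is made up for illustration, and base R subsetting is used in place of the dplyr pipeline only to keep the sketch short.

```r
library(dendextend)
library(colorspace)

small_mtcars <- head(mtcars)[, c("mpg", "cyl", "disp")]

d1 <- as.dendrogram(hclust(dist(small_mtcars), method = "average"))
d2 <- as.dendrogram(hclust(dist(small_mtcars), method = "complete"))

# Hypothetical helper vector: one fixed color per observation, keyed by row name
label_colors <- setNames(rainbow_hcl(nrow(small_mtcars)), rownames(small_mtcars))

# labels() returns the leaf labels in plotting order, so a lookup by name
# gives each leaf the color that belongs to its observation
labels_colors(d1) <- unname(label_colors[labels(d1)])
labels_colors(d2) <- unname(label_colors[labels(d2)])

# Sanity check: every car has the same color in both trees
c1 <- labels_colors(d1)
c2 <- labels_colors(d2)
same_in_both <- all(c1[rownames(small_mtcars)] == c2[rownames(small_mtcars)])
print(same_in_both)
```

With this variant the legend can be drawn from the same vector in both plots, for example `legend("topright", legend = names(label_colors), fill = label_colors)`.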
    
