Tags: r, cluster-analysis, dendrogram, dendextend

How to colour identical labels on a dendrogram in the same colour in R


I have clustered some data in R and plotted the result as a dendrogram. What I am trying to find out now is how to change the colour of the labels, so that identical labels get the same colour.

I got my dendrogram using the following code:

> d <- stringdist::stringdistmatrix(AR_GenesforR$AR_Genes)
> cl <- hclust(as.dist(d))
> plot(cl, labels = AR_GenesforR$AR_Genes)
> groups <- cutree(cl, k = 2)
> rect.hclust(cl, k = 2, border = "red")

The resulting dendrogram looks like this: [image]

What I want to do now is colour all identical labels in the same colour, e.g. all 2010 in yellow, all 2011 in blue, and so on. I have researched quite a bit, but mostly found only ways to colour the labels according to the cluster they belong to. Does someone know how I can do this?


Solution

  • Here is a function that does what you ask, based on the dendextend R package (a short two-page paper describes the package).

    # If needed: install.packages(c("dendextend", "colorspace"))
    library(dendextend)  # provides labels_colors<-
    
    # Toy data: the labels are the values themselves, so some labels repeat.
    x <- c(2011, 2011, 2012, 2012, 2015, 2015, 2015)
    names(x) <- x
    dend <- as.dendrogram(hclust(dist(x)))
    
    # Give every distinct label its own colour; identical labels share a colour.
    color_unique_labels <- function(dend, ...) {
        n_unique_labels <- length(unique(labels(dend)))
        colors <- colorspace::rainbow_hcl(n_unique_labels)
        # Map each label (in dendrogram order) to the index of its unique value.
        labels_number <- as.numeric(factor(labels(dend)))
        labels_colors(dend) <- colors[labels_number]
        dend
    }
    
    # Plot the original and the recoloured dendrogram side by side.
    par(mfrow = c(1, 2))
    plot(dend)
    dend2 <- color_unique_labels(dend)
    plot(dend2)
    

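The question also asks for specific colours per label (all 2010 in yellow, all 2011 in blue). One way to get that, sketched here rather than taken from the answer above, is to look the colours up in a named vector instead of generating a palette. The toy data and the `label_palette` vector are assumptions made up for illustration; only `labels_colors<-` comes from dendextend.

```r
library(dendextend)

# Toy data (hypothetical): labels are years, some of them repeated.
x <- c(2010, 2010, 2011, 2011, 2012)
names(x) <- x
dend <- as.dendrogram(hclust(dist(x)))

# Hypothetical palette: one fixed colour per distinct label.
label_palette <- c("2010" = "yellow", "2011" = "blue", "2012" = "darkgreen")

# labels(dend) returns the leaf labels in plotting order, so indexing the
# palette by label name yields one colour per leaf in that same order.
leaf_colors <- unname(label_palette[as.character(labels(dend))])
labels_colors(dend) <- leaf_colors

plot(dend)
```

A label missing from the palette would get `NA` here, so every distinct label should have an entry. For an `hclust` object such as `cl` from the question, the same idea should apply after `dend <- as.dendrogram(cl)`, provided the leaf labels of `dend` are the values you want to colour by.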