
Adding labels to the x-axis and y-axis of dendrograms in R


I have the code below for plotting a dendrogram in R:

library(colorspace)
library(dendextend)

data <- read.csv("test.csv")
data2 <- data[, -1]

site_labels <- as.factor(data[, 1])
sites_col <- rev(rainbow_hcl(12))[as.numeric(site_labels)]

d <- dist(data2)
hc <- hclust(d, method = "average")

dend <- as.dendrogram(hc)
dend <- color_branches(dend, k=3)

labels_colors(dend) <-
  rainbow_hcl(12)[sort_levels_values(
    as.numeric(data[, 1])[order.dendrogram(dend)]
  )]

labels(dend) <- paste(as.character(data[, 1])[order.dendrogram(dend)],
                      " (",labels(dend),")",
                      sep = "")

dend <- set(dend, "labels_cex", 0.5)
dend <- set(dend, "branches_lwd", 3)

par(mar = c(3,3,3,7))
plot(dend, horiz = TRUE, nodePar = list(cex = .007))

It works fine, but I would like to label both axes of the dendrogram (for example "Euclidean distance" for the x-axis and "OTUs" for the y-axis). According to the documentation, this should be done by passing the xlab and ylab parameters in the call to plot(dend), but they have no effect.

I would also like to change the limits of the x-axis scale. According to the documentation, this should be done with the xlim parameter, but when I set something like xlim=c(0, 70) the dendrogram is drawn reversed and the leaf labels end up misplaced.

Could someone give me some tips on how to solve both problems?

Here is the test data matrix I used:

OTU,VAR1,VAR2,VAR3,VAR4,VAR5
OTU1,1,1,0,0,1
OTU2,1,0,0,0,0
OTU3,1,1,1,0,1
OTU4,0,0,0,1,1

Solution

  • You can use title() to add the axis labels: xlab and ylab give the text, and the line argument sets how far out in the margin each label is drawn:

    par(mar = c(3,3,3,7))
    plot(dend, horiz = TRUE, nodePar = list(cex = .007))
    title(ylab = 'OTUs', line = 0)
    title(xlab = 'Euclidean distance', line = 2)
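A likely reason the xlab and ylab arguments appear to do nothing: axis titles are drawn on margin line 3 by default (the first element of par("mgp")), and par(mar = c(3,3,3,7)) leaves only 3 lines of bottom and left margin, so the titles fall outside the visible area. A minimal sketch under that assumption, using base R only and the test matrix from the question (the name mat is introduced here for illustration), passes the labels directly after widening the margins:

```r
# Test matrix from the question; `mat` is a name introduced for this sketch.
data <- read.table(text = "OTU,VAR1,VAR2,VAR3,VAR4,VAR5
OTU1,1,1,0,0,1
OTU2,1,0,0,0,0
OTU3,1,1,1,0,1
OTU4,0,0,0,1,1", header = TRUE, sep = ",")
mat <- as.matrix(data[, -1])   # numeric columns only
rownames(mat) <- data$OTU      # OTU names become the leaf labels

dend <- as.dendrogram(hclust(dist(mat), method = "average"))

# Bottom and left margins are now wider than the default title line (3),
# so xlab and ylab have room to be drawn.
par(mar = c(5, 4, 3, 7))
plot(dend, horiz = TRUE, xlab = "Euclidean distance", ylab = "OTUs")
```

Keeping the narrow margins and moving the labels inward with title(..., line = ...) achieves the same result without giving up plot area.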
    



    Edit

    You could also use mtext, where the side argument selects the margin to write in: side = 1 is the bottom (the x-axis here) and side = 4 is the right-hand side. The line argument sets the distance from the plot, which usually takes some trial and error to get right. Here is a reproducible example:

    library(colorspace)
    library(dendextend)
    data <- read.table(text = 'OTU,VAR1,VAR2,VAR3,VAR4,VAR5
    OTU1,1,1,0,0,1
    OTU2,1,0,0,0,0
    OTU3,1,1,1,0,1
    OTU4,0,0,0,1,1', header = TRUE, sep = ',')
    
    site_labels <- as.factor(data[, 1])
    sites_col <- rev(rainbow_hcl(12))[as.numeric(site_labels)]
    
    # Drop the OTU name column so that dist() only sees the numeric variables
    d <- dist(data[, -1])
    hc <- hclust(d, method = "average")
    
    dend <- as.dendrogram(hc)
    dend <- color_branches(dend, k=3)
    
    # Use the factor codes: as.numeric() on the character column would give NAs
    labels_colors(dend) <-
      rainbow_hcl(12)[sort_levels_values(
        as.numeric(site_labels)[order.dendrogram(dend)]
      )]
    
    labels(dend) <- paste(as.character(data[, 1])[order.dendrogram(dend)],
                          " (",labels(dend),")",
                          sep = "")
    
    dend <- set(dend, "labels_cex", 0.5)
    dend <- set(dend, "branches_lwd", 3)
    
    par(mar = c(3,3,3,7))
    plot(dend, horiz = TRUE, nodePar = list(cex = .007))
    mtext("OTUs", side=4, line=2)
    mtext('Euclidean distance', side = 1, line = 2)
    

    Created on 2022-10-16 with reprex v2.0.2
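
On the xlim part of the question: with horiz = TRUE, base R's dendrogram plot puts the height on the x-axis, running from the root height on the left down to 0 on the right, so the default limits are in decreasing order. Passing xlim = c(0, 70) therefore flips the tree, while giving the limits from high to low should keep the usual orientation. A sketch under that assumption, using base R only and the test matrix from the question (mat and max_height are names introduced here, and the value 3 is arbitrary for this small data set; something like c(70, 0) would be the equivalent for the real data):

```r
# Test matrix from the question; `mat` and `max_height` are illustrative names.
data <- read.table(text = "OTU,VAR1,VAR2,VAR3,VAR4,VAR5
OTU1,1,1,0,0,1
OTU2,1,0,0,0,0
OTU3,1,1,1,0,1
OTU4,0,0,0,1,1", header = TRUE, sep = ",")
mat <- as.matrix(data[, -1])   # numeric columns only
rownames(mat) <- data$OTU      # OTU names become the leaf labels

dend <- as.dendrogram(hclust(dist(mat), method = "average"))

max_height <- 3                # upper end of the distance axis (assumed value)

par(mar = c(3, 3, 3, 7))
# Limits are given from high to low to match the horizontal layout.
plot(dend, horiz = TRUE, xlim = c(max_height, 0))
title(xlab = "Euclidean distance", line = 2)
title(ylab = "OTUs", line = 0)
```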