Search code examples
rsortingheatmapcomplexheatmap

ComplexHeat: column_order is ignored when column_split is used


I opened this issue in the ComplexHeatmap github, but I got no response so far, and was hoping someone here could lend a hand...

I am trying to make the simplest heatmap possible, but the behavior is not what I expect. I just want to produce a heatmap with no column clustering, and the samples in a specific group order (non-alphabetical), with groups separated by a gap.

The problem is, when I want to add such gap, the order is ignored, and defaults to an alphabetical order.

Check the following MWE with the iris data.

1- Create my data matrix (small subset of iris for just setosa and virginica Species), and a meta information data frame with just sample ID and Species (my grouping variable):

data(iris)
my_setosa=subset(iris, Species=="setosa")
my_virginica=subset(iris, Species=="virginica")
set.seed(123)
rows_used1 <- sort(sample(1:nrow(my_setosa), 5, replace = F))
rows_used2 <- sort(sample(1:nrow(my_virginica), 5, replace = F))
my_iris=rbind(my_setosa[rows_used1,], my_virginica[rows_used2,])
#
heat_mat=t(as.matrix(my_iris[,-ncol(my_iris)]))
meta_df=data.frame(ID=paste0("id",rownames(my_iris)), Species=my_iris[,ncol(my_iris)])
meta_df$Species=droplevels(meta_df$Species)
colnames(heat_mat)=meta_df$ID

These look like this:

> heat_mat
             id3 id14 id15 id31 id42 id114 id125 id137 id143 id150
Sepal.Length 4.7  4.3  5.8  4.8  4.5   5.7   6.7   6.3   5.8   5.9
Sepal.Width  3.2  3.0  4.0  3.1  2.3   2.5   3.3   3.4   2.7   3.0
Petal.Length 1.3  1.1  1.2  1.6  1.3   5.0   5.7   5.6   5.1   5.1
Petal.Width  0.2  0.1  0.2  0.2  0.3   2.0   2.1   2.4   1.9   1.8
> meta_df
      ID   Species
1    id3    setosa
2   id14    setosa
3   id15    setosa
4   id31    setosa
5   id42    setosa
6  id114 virginica
7  id125 virginica
8  id137 virginica
9  id143 virginica
10 id150 virginica

2- Define heatmap and Species grouping colors, column annotation (Species groups), column order and column split (gap between Species groups):

palette=grDevices::colorRampPalette(c("green","black","red"))(11)
col_vec=c("red","blue")
col_vec=stats::setNames(col_vec, levels(meta_df$Species))
column_ha <- ComplexHeatmap::HeatmapAnnotation(
               Species = meta_df$Species,
               col = list(Species = col_vec),
               show_legend = TRUE)
column_order=meta_df[order(meta_df$Species, decreasing=T),]$ID
column_split <- meta_df$Species

Note here (and this is the problem), that I want the virginica group on the left, and the setosa group on the right, so my column order is the following:

> column_order
 [1] "id114" "id125" "id137" "id143" "id150" "id3"   "id14"  "id15"  "id31"
[10] "id42"

3- Make the heatmap without column_split:

complex_heat <- ComplexHeatmap::Heatmap(heat_mat, cluster_rows = FALSE, cluster_columns = FALSE,
                                        row_dend_width = grid::unit(2, "inch"),
                                        rect_gp = grid::gpar(col = "white", lwd = 2),
                                        col = palette,
                                        top_annotation = column_ha,
                                        column_order = column_order,
                                        #column_split = column_split,
                                        column_gap = grid::unit(0.1, "inch"),
                                        border = TRUE)
grDevices::png(filename="heatmap.png", height=400, width=600)
ComplexHeatmap::draw(complex_heat)
grDevices::dev.off()

This is all good, the heatmap produced below has my sample IDs ordered correctly with virginica on the left (in blue as specified by col_vector), and setosa on the right (in red as specified by col_vector):

test1

4- Make the heatmap with column_split; if I just uncomment the column_split line that specifies to split the column by the Species variable, I woould expect the exact same heatmap, with just a gap between the two Species groups. However, the column_order is ignored, and setosa samples appear on the left...

complex_heat <- ComplexHeatmap::Heatmap(heat_mat, cluster_rows = FALSE, cluster_columns = FALSE,
                                        row_dend_width = grid::unit(2, "inch"),
                                        rect_gp = grid::gpar(col = "white", lwd = 2),
                                        col = palette,
                                        top_annotation = column_ha,
                                        column_order = column_order,
                                        column_split = column_split,
                                        column_gap = grid::unit(0.1, "inch"),
                                        border = TRUE)

test2

... you could argue that column_split needs to be a character vector instead of a factor, or that it needs to be in the same order as the samples in column_order (which should not be the case, it should map to the heat_mat and meta_df order).

Obviously enough, if you just do column_split=rev(column_split) and try to produce the heatmap again, the sample grouping gets messed up:

test3

I am at a loss... this should be extremely easy to accomplish, but I cannot work it out, and in my real world case it's not just a cosmetic option.

Anybody has a clue, or is this just a bug of the package and this cannot be accomplished? Thanks!


Solution

  • In case someone needs this, column_split should be a factor with the same order as meta_df$Species, but the levels specified in the order we want them plotted...

    column_split=rev(column_split)) or levels(column_split)=rev(levels(column_split)) (what I was trying) do not work cause they mess up the sample group assignments as the order with respect to meta_df$Species changes.

    The solution is to define column_split like this:

    column_split=factor(as.character(meta_df$Species), levels=c('virginica','setosa'))

    ... instead of column_split <- meta_df$Species