I opened this issue in the ComplexHeatmap github, but I got no response so far, and was hoping someone here could lend a hand...
I am trying to make the simplest heatmap possible, but the behavior is not what I expect. I just want to produce a heatmap with no column clustering, and the samples in a specific group order (non-alphabetical), with groups separated by a gap.
The problem is, when I want to add such gap, the order is ignored, and defaults to an alphabetical order.
Check the following MWE with the iris
data.
1- Create my data matrix (small subset of iris
for just setosa
and virginica
Species), and a meta information data frame with just sample ID and Species (my grouping variable):
data(iris)
my_setosa=subset(iris, Species=="setosa")
my_virginica=subset(iris, Species=="virginica")
set.seed(123)
rows_used1 <- sort(sample(1:nrow(my_setosa), 5, replace = F))
rows_used2 <- sort(sample(1:nrow(my_virginica), 5, replace = F))
my_iris=rbind(my_setosa[rows_used1,], my_virginica[rows_used2,])
#
heat_mat=t(as.matrix(my_iris[,-ncol(my_iris)]))
meta_df=data.frame(ID=paste0("id",rownames(my_iris)), Species=my_iris[,ncol(my_iris)])
meta_df$Species=droplevels(meta_df$Species)
colnames(heat_mat)=meta_df$ID
These look like this:
> heat_mat
id3 id14 id15 id31 id42 id114 id125 id137 id143 id150
Sepal.Length 4.7 4.3 5.8 4.8 4.5 5.7 6.7 6.3 5.8 5.9
Sepal.Width 3.2 3.0 4.0 3.1 2.3 2.5 3.3 3.4 2.7 3.0
Petal.Length 1.3 1.1 1.2 1.6 1.3 5.0 5.7 5.6 5.1 5.1
Petal.Width 0.2 0.1 0.2 0.2 0.3 2.0 2.1 2.4 1.9 1.8
> meta_df
ID Species
1 id3 setosa
2 id14 setosa
3 id15 setosa
4 id31 setosa
5 id42 setosa
6 id114 virginica
7 id125 virginica
8 id137 virginica
9 id143 virginica
10 id150 virginica
2- Define heatmap and Species grouping colors, column annotation (Species groups), column order and column split (gap between Species groups):
palette=grDevices::colorRampPalette(c("green","black","red"))(11)
col_vec=c("red","blue")
col_vec=stats::setNames(col_vec, levels(meta_df$Species))
column_ha <- ComplexHeatmap::HeatmapAnnotation(
Species = meta_df$Species,
col = list(Species = col_vec),
show_legend = TRUE)
column_order=meta_df[order(meta_df$Species, decreasing=T),]$ID
column_split <- meta_df$Species
Note here (and this is the problem), that I want the virginica
group on the left, and the setosa
group on the right, so my column order is the following:
> column_order
[1] "id114" "id125" "id137" "id143" "id150" "id3" "id14" "id15" "id31"
[10] "id42"
3- Make the heatmap without column_split
:
complex_heat <- ComplexHeatmap::Heatmap(heat_mat, cluster_rows = FALSE, cluster_columns = FALSE,
row_dend_width = grid::unit(2, "inch"),
rect_gp = grid::gpar(col = "white", lwd = 2),
col = palette,
top_annotation = column_ha,
column_order = column_order,
#column_split = column_split,
column_gap = grid::unit(0.1, "inch"),
border = TRUE)
grDevices::png(filename="heatmap.png", height=400, width=600)
ComplexHeatmap::draw(complex_heat)
grDevices::dev.off()
This is all good, the heatmap produced below has my sample IDs ordered correctly with virginica
on the left (in blue as specified by col_vector
), and setosa
on the right (in red as specified by col_vector
):
4- Make the heatmap with column_split
; if I just uncomment the column_split
line that specifies to split the column by the Species variable, I woould expect the exact same heatmap, with just a gap between the two Species groups. However, the column_order
is ignored, and setosa
samples appear on the left...
complex_heat <- ComplexHeatmap::Heatmap(heat_mat, cluster_rows = FALSE, cluster_columns = FALSE,
row_dend_width = grid::unit(2, "inch"),
rect_gp = grid::gpar(col = "white", lwd = 2),
col = palette,
top_annotation = column_ha,
column_order = column_order,
column_split = column_split,
column_gap = grid::unit(0.1, "inch"),
border = TRUE)
... you could argue that column_split
needs to be a character vector instead of a factor, or that it needs to be in the same order as the samples in column_order
(which should not be the case, it should map to the heat_mat
and meta_df
order).
Obviously enough, if you just do column_split=rev(column_split)
and try to produce the heatmap again, the sample grouping gets messed up:
I am at a loss... this should be extremely easy to accomplish, but I cannot work it out, and in my real world case it's not just a cosmetic option.
Anybody has a clue, or is this just a bug of the package and this cannot be accomplished? Thanks!
In case someone needs this, column_split
should be a factor with the same order as meta_df$Species
, but the levels specified in the order we want them plotted...
column_split=rev(column_split))
or levels(column_split)=rev(levels(column_split))
(what I was trying) do not work cause they mess up the sample group assignments as the order with respect to meta_df$Species
changes.
The solution is to define column_split
like this:
column_split=factor(as.character(meta_df$Species), levels=c('virginica','setosa'))
... instead of column_split <- meta_df$Species