Tags: r, ggplot2, plotly, heatmap, heatmaply

How to add custom text per column of a heatmap in R?


I have a dataset that I am plotting as a heatmap to compare 7 groups. Each group also has 2 columns of data that describe it. I am trying to create an interactive plot whose hover text shows each group's information from its own two columns.

Here is an example of the data where 7 groups each have 2 columns of corresponding information:

df <- structure(list(Group1 = c(9.420318259, 5.801092847, 4.890727291, 
4.589825753, 4.836092781), Group2 = c(14.57805564, 8.798453748, 
7.982599836, 7.951599435, 10.81418654), Group3 = c(14.49131554, 
7.975284646, 8.258878348, 7.922657108, 13.3205827), Group4 = c(11.44447147, 
6.208332721, 6.529806574, 4.882623805, 10.69676399), Group5 = c(22.86835197, 
10.94297858, 7.197041788, 9.237584441, 12.70083108), Group6 = c(10.62687539, 
6.458410247, 7.461916094, 6.308454021, 12.39464562), Group7 = c(11.09404106, 
6.420303272, 6.821000583, 5.0727153, 11.13903127), Group1_Genes = c(46L, 
17L, 23L, 16L, 27L), Group1_Score = c(0.719, 0.757, 0.71, 0.807, 
0.761), Group2_Genes = c(58L, 22L, 30L, 22L, 40L), Group2_Score = c(0.754, 
0.766, 0.741, 0.807, 0.774), Group3_Genes = c(37L, 14L, 14L, 
13L, 22L), Group3_Score = c(0.798, 0.788, 0.81, 0.879, 0.805), 
    Group4_Genes = c(55L, 20L, 29L, 21L, 42L), Group4_Score = c(0.774, 
    0.768, 0.741, 0.822, 0.781), Group5_Genes = c(71L, 24L, 37L, 
    23L, 53L), Group5_Score = c(0.766, 0.767, 0.765, 0.811, 0.771
    ), Group6_Genes = c(69L, 24L, 37L, 23L, 53L), Group6_Score = c(0.772, 
    0.767, 0.765, 0.811, 0.771), Group7_Genes = c(58L, 21L, 33L, 
    22L, 48L), Group7_Score = c(0.79, 0.78, 0.774, 0.817, 0.78
    )), row.names = c("Cardiac Hypertrophy", 
"Cellular Effects of Adrenaline", "Metastasis Signaling", 
"Hormone Signaling", "Estrogen Receptor Signaling"
), class = "data.frame")
#One row of this data looks like:
Pathway  Group1  Group2  Group3  Group4  Group5  Group6  Group7  Group1_Score  Group1_Genes  Group2_Score  Group2_Genes ...
Cardiac  0.7      0.8      0.5    0.7      0.3    0.6     0.6        0.6           34           0.4     65

I am trying to plot Group1 to Group7 (columns 1:7) as a heatmap and build the hover text from the remaining columns. I adapted an answer to another question (How to create an interactive heatmaply plot with custom text in R?) as follows:

library(dplyr)
library(heatmaply)

groups <- as.matrix(df[,1:7])

labels1 <- 
  df  %>% 
  mutate(label1 = paste(
    "Gene Overlap:", Group1_Genes,
    "\nMean_GB_Score:", Group1_Score
  )) %>% 

  transmute(across(Group1, ~label1)) %>% 
  as.matrix()

labels2 <- 
  df  %>% 
  mutate(label2 = paste(
    "Gene Overlap:", Group2_Genes,
    "\nMean_GB_Score:", Group2_Score
  )) %>% 

  transmute(across(Group2, ~label2)) %>% 
  as.matrix()


# I repeat this for all 7 groups (labels3 to labels7 are built the same way), then cbind:

labels = cbind(labels1, labels2, labels3, labels4, labels5, labels6, labels7)

heatmaply(groups, custom_hovertext = labels,
          file = "heatmaply_plot.html",
          scale_fill_gradient_fun = ggplot2::scale_fill_gradient2(
  low = "pink", 
  high = "red"))

However, this produces an error:

Error in custom_hovertext[rowInd, colInd, drop = FALSE] : 
  subscript out of bounds

Is there a way to build the custom_hovertext argument of heatmaply() so that each column of the heatmap gets hover text from its own information columns, rather than the same global information being attached to every heatmap cell?


Solution

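    library(dplyr)
    library(tidyr)
    library(tibble)

    labels_df <- 
      df %>% 
      select(ends_with("Score"), ends_with("Genes")) %>% 
      rownames_to_column() %>% 
      # long format: one row per pathway and per Group*_Score / Group*_Genes column
      pivot_longer(-rowname) %>% 
      # split e.g. "Group1_Score" into Group = "Group1", var = "Score"
      separate(name, c("Group", "var")) %>% 
      # id_cols is named explicitly; newer tidyr versions warn when it is passed by position
      pivot_wider(id_cols = c(rowname, Group), names_from = var, values_from = value) %>% 
      mutate(label = paste(
        "Gene Overlap:", Genes,
        "\nMean_GB_Score:", Score
      )) %>% 
      # back to wide format: one label column per group
      pivot_wider(id_cols = rowname, names_from = Group, values_from = label)
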

    You can see what happens at each step by breaking the chain at any point and running the code up to there. In short, we reshape the data into a tidy long format so that the label is computed once instead of typing 7 near-identical expressions, and then reshape back to the wide format that heatmaply needs.
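For comparison, the same "compute the label once per group" idea can be sketched in base R without any reshaping. This is an alternative technique, not the chain used above, and it runs on a small made-up data frame (`toy`, `group_names`, and all values are hypothetical) that only mimics the `GroupN`, `GroupN_Genes`, `GroupN_Score` naming of the question's data.

```r
# Hypothetical toy data with the same column naming scheme as `df`.
toy <- data.frame(
  Group1 = c(1.5, 2.5), Group2 = c(3.5, 4.5),
  Group1_Genes = c(10L, 20L), Group1_Score = c(0.1, 0.2),
  Group2_Genes = c(30L, 40L), Group2_Score = c(0.3, 0.4),
  row.names = c("Pathway A", "Pathway B")
)
group_names <- c("Group1", "Group2")

# One label vector per group, looked up by column name;
# sapply() binds the vectors into a character matrix with one column per group.
toy_labels <- sapply(group_names, function(g) {
  paste(
    "Gene Overlap:", toy[[paste0(g, "_Genes")]],
    "\nMean_GB_Score:", toy[[paste0(g, "_Score")]]
  )
})
rownames(toy_labels) <- rownames(toy)
```

Because the labels are built column by column from the original data frame, the row order cannot change here. Note that `sapply()` returns a plain vector rather than a matrix when the data has a single row.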

    One important point: after all this reshaping, the rows happen to end up in the same order as in the original data. That is convenient, but it is better to check than to assume.
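One way to make that check explicit is to compare the row identifiers and, if needed, reorder the labels with `match()`. The sketch below uses small hypothetical stand-ins (`pathways` for `rownames(df)` and `toy_labels` for `labels_df`), with the label rows deliberately out of order.

```r
# Hypothetical stand-ins: `pathways` plays the role of rownames(df),
# `toy_labels` the role of labels_df (rows deliberately shuffled).
pathways <- c("Pathway C", "Pathway A", "Pathway B")
toy_labels <- data.frame(
  rowname = c("Pathway A", "Pathway B", "Pathway C"),
  Group1 = c("a1", "b1", "c1"),
  Group2 = c("a2", "b2", "c2"),
  stringsAsFactors = FALSE
)

# Reorder the label rows so they follow the heatmap's row order.
toy_labels <- toy_labels[match(pathways, toy_labels$rowname), ]

# Fail loudly if the two orders still disagree.
stopifnot(identical(toy_labels$rowname, pathways))
```

With the real objects, the equivalent check would compare `labels_df$rowname` against `rownames(df)`.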

    Labels in the matrix form:

    labels_mat <- 
      labels_df %>% 
      select(Group1:Group7) %>% 
      as.matrix()
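
The error in the question was raised while indexing `custom_hovertext` by row and column, so it may be worth confirming that the label matrix has the same shape and column names as the value matrix before plotting. The helper below, `check_hovertext()`, is a hypothetical name and not part of heatmaply; it is shown on made-up toy matrices.

```r
# Toy stand-ins (hypothetical values) for `groups` and `labels_mat`.
toy_groups <- matrix(
  c(1.5, 2.5, 3.5, 4.5), nrow = 2,
  dimnames = list(c("Pathway A", "Pathway B"), c("Group1", "Group2"))
)
toy_labels_mat <- matrix(
  c("a1", "b1", "a2", "b2"), nrow = 2,
  dimnames = list(NULL, c("Group1", "Group2"))
)

# Hypothetical helper: stops with an error unless both inputs are
# matrices with identical dimensions and identical column names.
check_hovertext <- function(values, labels) {
  stopifnot(
    is.matrix(values), is.matrix(labels),
    identical(dim(values), dim(labels)),
    identical(colnames(values), colnames(labels))
  )
  invisible(TRUE)
}

check_hovertext(toy_groups, toy_labels_mat)
```

With the real objects, the call would be `check_hovertext(groups, labels_mat)`.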
    

    And finally:

    heatmaply(
      groups,
      custom_hovertext = labels_mat,
      scale_fill_gradient_fun = ggplot2::scale_fill_gradient2(low = "pink", high = "red")
    )