
Automatically categorize and add annotations using pheatmap in R


I have a dataset of the school grades of some students in different subjects. Each student's gender (F or M) is included as a suffix in their name (e.g. Anne_F, Albert_M, etc.). From these data I created a heatmap with the pheatmap package, like this:

library(pheatmap)

  Anne_F <- c(9,7,6,10,6)
  Carl_M <- c(6,7,9,5,7)
  Albert_M <- c(8,8,8,7,9)
  Kate_F <- c(10,5,10,9,5)
  Emma_F <- c(6,8,10,8,7)
  
  matrix <- cbind(Anne_F, Carl_M, Albert_M, Kate_F, Emma_F)
  rownames(matrix) <- c("Math", "Literature", "Arts", "Science", "Music")
  
  print(matrix)
  
  heatmap <- pheatmap(
     mat = matrix,
     cluster_rows = FALSE,
     cluster_cols = FALSE,
     cellwidth = 30,
     cellheight = 30
  )
  
heatmap

This prints the matrix:

[image: printed matrix]

and draws the corresponding plot:

[image: heatmap without annotations]

Now I would like to automatically detect whether a student is male or female and add this as a column annotation in the heatmap, to get a graph like this:

[image: heatmap with a gender column annotation]

I thought of creating two vectors, one with the names of the students, name <- c("Anne", "Carl", "Albert", "Kate", "Emma"), and one with the respective genders, gender <- c("F", "M", "M", "F", "F"), but I can't figure out how to associate names with genders and show them on the heatmap.

I don't want to manually map each name to a gender (Anne to F, Albert to M, etc.). I need to take the entire vector of names and associate it with the corresponding vector of genders (and then annotate them on the heatmap), because the number of students will increase in the future.

Many thanks in advance for your help.


Solution

  • You need to use the annotation_col option of pheatmap(). It takes a data frame whose row names match the column names of the matrix.

    library(pheatmap)
    library(stringr)  # provides str_split_fixed()

    # Split each column name at "_" into a name part and a gender part
    name_gender_matrix <- str_split_fixed(colnames(matrix), "_", 2)

    # pheatmap matches annotation rows to heatmap columns by name,
    # so the student names go into the row names of the annotation data frame
    annot_col <- data.frame(row.names = name_gender_matrix[, 1], Gender = name_gender_matrix[, 2])

    # Rename the matrix columns so they match the annotation row names
    colnames(matrix) <- rownames(annot_col)

    heatmap <- pheatmap(
      mat = matrix,
      cluster_rows = FALSE,
      cluster_cols = FALSE,
      cellwidth = 30,
      cellheight = 30,
      annotation_col = annot_col
    )
    

    [image: heatmap with the Gender column annotation]
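
If you would rather not depend on stringr, the same idea can be sketched in base R with sub(). This is only a sketch under a few assumptions: every column name has the form name_gender, the part before the last underscore is unique (row names of a data frame must be), and the helper make_gender_annotation, the variable name grades, and the colour choices are my own illustrative names, not anything pheatmap requires. The annotation_colors argument is optional and just fixes the legend colours.

```r
# Rebuild the example matrix (named `grades` here to avoid masking base::matrix()).
grades <- cbind(
  Anne_F   = c(9, 7, 6, 10, 6),
  Carl_M   = c(6, 7, 9, 5, 7),
  Albert_M = c(8, 8, 8, 7, 9),
  Kate_F   = c(10, 5, 10, 9, 5),
  Emma_F   = c(6, 8, 10, 8, 7)
)
rownames(grades) <- c("Math", "Literature", "Arts", "Science", "Music")

# Hypothetical helper: derive a column annotation from "<name>_<gender>" labels.
make_gender_annotation <- function(labels) {
  data.frame(
    Gender = factor(sub("^.*_", "", labels)),  # text after the last underscore
    row.names = sub("_[^_]*$", "", labels)     # text before the last underscore
  )
}

annot_col <- make_gender_annotation(colnames(grades))

# pheatmap matches annotation rows to matrix columns by name.
colnames(grades) <- rownames(annot_col)

# Optional: fixed colours per gender (arbitrary choices).
annot_colors <- list(Gender = c(F = "orchid", M = "steelblue"))

# Draw the heatmap only if pheatmap is installed.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(
    mat = grades,
    cluster_rows = FALSE,
    cluster_cols = FALSE,
    cellwidth = 30,
    cellheight = 30,
    annotation_col = annot_col,
    annotation_colors = annot_colors
  )
}
```

Because the annotation is derived from colnames(grades), adding more students later only means adding more columns; nothing has to be mapped by hand. Note that a column name without an underscore would come through unchanged as both name and gender, so you may want to validate the labels first.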