geom_tile: unable to sort tiles into a diagonal


I have a set of chromosomal locations called bands. I want to generate all pairwise combinations of them and plot them with ggplot2's geom_tile. Ideally, the tiles should form a triangle on one side of the diagonal (upper or lower).

However, I have not been able to achieve this, despite trying several approaches.

chrom_locations <- c("10q23.31", "12p13.33", "12q24.33", "14q32.33", "17q21.33", "19p13.11", "19q13.31", "19q13.42", "19q13.43", "20q13.33", "22q11.22", "22q13.33")


combinations_list <- combn(chrom_locations, 2, simplify = FALSE)

len_combinations_list <- length(combinations_list) ## 66 pairs: choose(12, 2)
df_pairs_pvalue <- data.frame(
  band1 = character(len_combinations_list),   # character column
  band2 = character(len_combinations_list),   # character column
  count_both = numeric(len_combinations_list) # numeric column
)


for (x in seq_len(len_combinations_list)) {

  ## store each pair in the df_pairs_pvalue data frame
  band1 <- combinations_list[[x]][1]
  band2 <- combinations_list[[x]][2]

  df_pairs_pvalue[x, "band1"] <- band1
  df_pairs_pvalue[x, "band2"] <- band2
  df_pairs_pvalue[x, "count_both"] <- sample(1:100, 1) # random placeholder count
}
library(ggplot2)
library(stringr)

sorted_locations <- stringr::str_sort(chrom_locations, numeric = TRUE)

# Reorder the levels of band1 and band2
df_pairs_pvalue$band1 <- factor(df_pairs_pvalue$band1, levels = sorted_locations)
df_pairs_pvalue$band2 <- factor(df_pairs_pvalue$band2, levels = sorted_locations)

ggplot(df_pairs_pvalue, aes(band2, band1)) + geom_tile(aes(fill = count_both ), color = "black") +
  #scale_fill_viridis()+ 
  scale_fill_gradient2(low = "#075AFF",
                       mid = "#FFFFCC",
                       high = "#FF0000",
                       midpoint = 0,        # This sets the middle point of the scale in data terms
                       limits = c(0, 130),   # Replace 0 and 130 with your actual data range min and max
                       breaks = seq(0, 100, 10), # Customize breaks if needed, or adjust according to your data range
                       labels = seq(0, 100, 10)) 
  

This produces the attached plot, which is not the diagonal layout I want.

I also tried another way of sorting:

desired_order <- gtools::mixedsort(rownames(df_focal_cna_filtered_xposed))

df_pairs_pvalue_adj$band1 <- factor(df_pairs_pvalue_adj$band1, levels = desired_order)
df_pairs_pvalue_adj$band2 <- factor(df_pairs_pvalue_adj$band2, levels = desired_order)

This doesn't work either.

Any help would be highly appreciated.


Solution

    ## make your factors `ordered` so that comparisons like `<` and `>` work
    df_pairs_pvalue$band1 <- factor(df_pairs_pvalue$band1, levels = sorted_locations, ordered = TRUE)
    df_pairs_pvalue$band2 <- factor(df_pairs_pvalue$band2, levels = sorted_locations, ordered = TRUE)
    
    ## use pmin and pmax to get the row-wise min and max factor values
    df_pairs_pvalue$minband = with(df_pairs_pvalue, pmin(band1, band2))
    df_pairs_pvalue$maxband = with(df_pairs_pvalue, pmax(band1, band2))
    
    ## use those row-wise min and max values as the x and y aesthetics
    
    ggplot(df_pairs_pvalue, aes(minband, maxband)) + geom_tile(aes(fill = count_both ), color = "black") +
      #scale_fill_viridis()+ 
      scale_fill_gradient2(low = "#075AFF",
                           mid = "#FFFFCC",
                           high = "#FF0000",
                           midpoint = 0,        # This sets the middle point of the scale in data terms
                           limits = c(0, 130),   # Replace 0 and 130 with your actual data range min and max
                           breaks = seq(0, 100, 10), # Customize breaks if needed, or adjust according to your data range
                           labels = seq(0, 100, 10)) 
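
To see why the row-wise min and max help, here is a minimal base R sketch on toy data. The four bands, their shuffled input order, and the names `bands`, `sorted_bands`, `mixed_before` and `one_sided_after` are assumptions made up for illustration; they are not from the question's real data. The levels are written out by hand here so the sketch needs no extra packages. When the input vector is not already in axis order, `combn()` can put the larger band in either column, so tiles land on both sides of the diagonal. Reordering factor levels only changes the axis order, not which column a band sits in. `pmin()` and `pmax()` fix the column assignment instead.

```r
# Hypothetical toy data: four bands, deliberately NOT in sorted order
bands <- c("19q13.31", "10q23.31", "22q11.22", "12p13.33")

# Desired axis order, written out by hand for this sketch
sorted_bands <- c("10q23.31", "12p13.33", "19q13.31", "22q11.22")

# One row per unordered pair; combn() returns a 2 x n matrix, so transpose it
pairs <- t(combn(bands, 2))
df <- data.frame(
  band1 = factor(pairs[, 1], levels = sorted_bands, ordered = TRUE),
  band2 = factor(pairs[, 2], levels = sorted_bands, ordered = TRUE)
)

# Before the fix: some rows have band1 > band2 and others band1 < band2,
# so plotting aes(band2, band1) puts tiles on both sides of the diagonal
mixed_before <- any(df$band1 > df$band2) && any(df$band1 < df$band2)

# Row-wise min and max always put the smaller band in the same column
df$minband <- pmin(df$band1, df$band2)
df$maxband <- pmax(df$band1, df$band2)

# After the fix: every row satisfies minband < maxband,
# so all tiles fall on one side of the diagonal
one_sided_after <- all(df$minband < df$maxband)

print(mixed_before)
print(one_sided_after)
```

Swapping `minband` and `maxband` in `aes()` should give the other triangle.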
    
