Tags: r, dataframe, bioinformatics, bioconductor, genomicranges

How to calculate the length distribution and coverage distribution of a BED file


I have a dataset in a BED file, and I want to calculate and plot the length distribution and coverage distribution of its intervals. How can I calculate the length distribution in R?

df1:

chr21 2800 3270
chr21 3600 4152
chr2 3719 5092
chr22 3893 4547 
chr2 339 5092
chr22 3563 3597 
df1 <- structure(list(df1c = c("chr21", "chr21", "chr2", "chr22", "chr2", "chr22"),
    df1c2 = c(2800, 3600, 3719, 3893, 339, 3563),
    df1c3 = c(3270, 4152, 5092, 4547, 5092, 3597)),
    class = "data.frame", row.names = c(NA, -6L))

Solution

  • You could do:

    library(tidyverse)

    # Draw each chromosome as a gray bar and overlay the BED intervals on it.
    df1 %>%
      ggplot(aes(y = df1c2, x = df1c, xend = df1c, yend = df1c3)) +
      # Background bar from position 1 to the largest interval end.
      geom_segment(aes(y = 1, yend = max(df1c3)), size = 8, lineend = 'round',
                   color = 'gray20') +
      # One colored segment per interval (start to end).
      geom_segment(size = 7, aes(color = 'coverage')) +
      coord_flip() +
      labs(color = '', y = 'Location', x = 'Chromosome') +
      theme_light(base_size = 16)
    

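The plot shows where the intervals fall, but it does not compute the two distributions the question asks for. Below is a minimal base R sketch under two assumptions: BED coordinates are 0-based and half-open (so an interval's length is end minus start), and "coverage" means the number of intervals overlapping each base. `depth_segments` is a hypothetical helper name chosen for illustration, not a library function.

```r
# Example data from the question (columns: chromosome, start, end).
df1 <- data.frame(
  df1c  = c("chr21", "chr21", "chr2", "chr22", "chr2", "chr22"),
  df1c2 = c(2800, 3600, 3719, 3893, 339, 3563),
  df1c3 = c(3270, 4152, 5092, 4547, 5092, 3597)
)

# Length distribution: assuming half-open BED intervals, length = end - start.
df1$len <- df1$df1c3 - df1$df1c2
print(summary(df1$len))
hist(df1$len, breaks = 10,
     main = "Interval length distribution", xlab = "Length (bp)")

# Hypothetical helper: depth on one chromosome via a boundary sweep.
# Each start adds 1, each end subtracts 1; the running sum is the depth
# on the segment between two consecutive boundaries.
depth_segments <- function(start, end) {
  pos   <- c(start, end)
  delta <- c(rep(1, length(start)), rep(-1, length(end)))
  brk   <- sort(unique(pos))
  net   <- vapply(brk, function(p) sum(delta[pos == p]), numeric(1))
  data.frame(
    seg_start = head(brk, -1),
    seg_end   = tail(brk, -1),
    depth     = head(cumsum(net), -1)
  )
}

# Apply the helper per chromosome and stack the results.
cov <- do.call(rbind, lapply(split(df1, df1$df1c), function(d) {
  seg <- depth_segments(d$df1c2, d$df1c3)
  seg$chrom <- d$df1c[1]
  seg
}))

# Keep covered segments only, then count bases at each depth.
cov <- cov[cov$depth > 0, ]
cov$bases <- cov$seg_end - cov$seg_start
cov_dist <- tapply(cov$bases, cov$depth, sum)
print(cov_dist)
barplot(cov_dist, main = "Coverage distribution",
        xlab = "Depth (overlapping intervals)", ylab = "Bases")
```

Bases that no interval covers are left out here because the BED file alone does not give chromosome lengths. If Bioconductor is available, the GenomicRanges package offers `width()` and `coverage()` on a `GRanges` object for the same two quantities; note that `GRanges` uses 1-based, closed coordinates, so BED starts need to be shifted by one when converting.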