Search code examples
rbioinformaticsgenome

How do I rewrite this expected depth of (genome) coverage function in R?


I need to draw the probability density for a random position for Length of fragment = 600, Genome size = 3 × 109, and Number of reads = 10 million reads

depth_of_coverage <- function(genome = 3E9, fragment_length = 600, reads = 10E6) {
  depth <- 0
  for(i in 1:reads){
    random_position <- sample(1:genome, 1)
    coverage <- sample(1:genome, fragment_length)
    depth <- depth + sum(coverage == random_position)
  }
  return(depth)
}

depth_of_coverage()

This takes a long time to run...


Solution

  • Sample the binomial distribution using rbinom():

    depth_of_coverage <- function(genome = 3E9, fragment_length = 600, reads = 10E6, sims = 1) {
      rbinom(sims, reads, fragment_length/genome)
    }
    

    This will be many orders of magnitude faster than your current approach.