Search code examples
rfunctionmathcoordinatesbioconductor

Finding the coordinates from a BCV plot in edgeR


How do I know the function(x) and consequently, the coordinates of the function that generate the graph of dispersion (BCV) from edgeR package?

I need to know "the coordinates" of the Vertex (max and min value for each Vertex) of the "trend" dispersion and where the "trend line (blue)" intersect the common dispersion line (red line)

enter image description here

For convenience, I did an example:

data <- "chr start end depth depth1 depth2 depth3 depth4 depth5 depth6
chr1 3273 3273 7 200 35 1 200 850 0
chr1 3274 3274 3 50 25 5 300 1500 2
chr1 3275 3275 8 600 15 8 100 300 5
chr1 3276 3276 4 30 2 10 59 20 0
chr1 3277 3277 25 20 7 4 600 45 0"
data <- read.table(text=data, header=T)

datamatrix <- data [, c(4:10)]
library (edgeR)

#grouping factor
group <- c(1, 2, 2, 2, 2, 2, 2) #groups of each sample)

#create a DGEList
y <- DGEList(counts=datamatrix,group=group)

According to edgeR package, I can estimate the dispersion of my dataset using:

y <- estimateDisp(y) 

y$common.dispersion

The square root of the common dispersion gives the coefficient of biological variation. And, the dispersion estimates can be viewed in a BCV plot:

plotBCV(y)

Solution

  • I think this solves the trick

    #finding dispersion coordinates
    
        afun <- approxfun(y$AveLogCPM, y$trended.dispersion)
    
    #ranging between -2 and 12 the "par" number to check the intersection
        optim(par=-1, function(x) { (afun(x) - y$common.dispersion)^2 }) #-0.9056
        optim(par=0, function(x) { (afun(x) - y$common.dispersion)^2 }) #-0.4955
        optim(par=2, function(x) { (afun(x) - y$common.dispersion)^2 }) #1.8237
    
    #with the plotBCV function code, I have the access to tagwise, trended, common dispersions, and the abundance values, for each CpG as: y$tagwise.dispersion, y$trended.dispersion, y$common.dispersion, and y$AveLogCPM, respectively. 
    
    min (y$trended.dispersion) #[1] 1.09009
    max (y$trended.dispersion) #[1] 1.50691
    
    #to the vertex, you can subset a table according to the AveLogCPM range that you want to know the max and min like this:
    
    afuntable<- data.table(y$AveLogCPM, y$trended.dispersion)
    afuntablemod <-subset (afuntable, V1 >= 0)
    afuntablemod <-subset (afuntablemod, V1 <= 2)
    min (afuntablemod$V2) #[1] 1.318002
    
    #and again
    afuntable<- data.table(y$AveLogCPM, y$trended.dispersion)
    afuntablemod <-subset (afuntable, V1 >= 2)
    afuntablemod <-subset (afuntablemod, V1 <= 12)
    max (afuntablemod$V2) #[1] 1.50691