Finding overlap in dataframe ranges in R


I have two BED files loaded as data frames in R, and I want to pair every region in one with each region it overlaps in the other (similar to what bedtools closest would be able to do).

BedA:

chr   start   end
 2       100     500
 2       200     250
 3       275     300

BedB:

chr    start    end
  2       210      265
  2       99       106
  8       275      290

BedOut:

chr   start.A   end.A  start.B  end.B
 2       100     500      210      265
 2       100     500      99       106
 2       200     250      210      265

I found this very similar question, which suggests using IRanges. The proposed approach seems to work, but I have no idea how to turn the output into a data frame like "BedOut".


Solution

  • A data.table option using foverlaps:

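    # Key both tables on (chr, start, end); foverlaps() treats the last two key columns as the interval
    setkeyv(BedA, names(BedA))
    setkeyv(BedB, names(BedB))
    # nomatch = 0L keeps only BedB rows that overlap some BedA range (inner join)
    ans <- foverlaps(BedB, BedA, nomatch = 0L)
    # BedB's interval columns come back with an "i." prefix; rename to match BedOut
    setnames(ans, c("start", "end", "i.start", "i.end"), c("start.A", "end.A", "start.B", "end.B"))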
    

    output:

       chr start.A end.A start.B end.B
    1:   2     100   500      99   106
    2:   2     100   500     210   265
    3:   2     200   250     210   265
    

    data (run this before the code above):

    library(data.table)
    BedA <- fread("chr   start   end
    2       100     500
    2       200     250
    3       275     300")
    
    BedB <- fread("chr    start    end
    2       210      265
    2       99       106
    8       275      290")
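
As a cross-check that does not need data.table, the same pairs can be sketched in base R. This is only an illustration, and it assumes the inputs are small enough to join every same-chromosome pair before filtering, which will not scale to large BED files. The data frames are rebuilt here so the snippet runs on its own, and intervals are treated as closed (touching endpoints count as overlap).

```r
# Same example data as plain data frames
BedA <- data.frame(chr = c(2, 2, 3), start = c(100, 200, 275), end = c(500, 250, 300))
BedB <- data.frame(chr = c(2, 2, 8), start = c(210, 99, 275), end = c(265, 106, 290))

# Pair every BedA row with every BedB row on the same chromosome
pairs <- merge(BedA, BedB, by = "chr", suffixes = c(".A", ".B"))

# Two closed intervals overlap when each one starts before the other ends
BedOut <- pairs[pairs$start.A <= pairs$end.B & pairs$start.B <= pairs$end.A, ]

# Sort for a stable, readable result and reset the row names
BedOut <- BedOut[order(BedOut$chr, BedOut$start.A, BedOut$start.B), ]
rownames(BedOut) <- NULL
print(BedOut)
```

For real BED files the keyed foverlaps join is the better fit, since it avoids materialising every same-chromosome pair.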