
How to generate a custom bed file to use for bedtools intersect?


I have a custom reference genome, gene.fa, and 18 BED files. I want to generate a BED file that contains a region of interest, 5100-5600 bp, as a single entry that I can intersect with my 18 BED files using bedtools intersect.

I was thinking of copying the region-of-interest sequence from the reference genome and aligning it to generate my BED file. The problem is that my reference genome is a trimer, so this sequence is repeated three times and there would be errors in the alignment.

Is there a better way to do this? Can you use bedtools intersect with a text file?

I am new to bioinformatics and sequencing so I may be overthinking this problem.


Solution

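  • BED files are plain text files, so if you only have a small number of regions of interest and you know their coordinates, you can write the file with a text editor. Each line needs at least three tab-separated fields: the sequence name, a 0-based start, and an end position that is not included in the interval. See the BED file specification.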

    If you only have the sequence of your ROI, you can get its coordinates by aligning it to the genome, for example with BLAST. A sequence that appears in the genome multiple times should not cause errors, but you need to know which alignment corresponds to your true ROI, or else include all of them in the BED file as separate entries.
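
Both approaches above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive recipe, and it rests on assumptions that are not in the question: that 5100-5600 are 1-based inclusive positions, that the sequence in gene.fa is named "gene" (use whatever follows ">" in the FASTA header), and that the output file is called roi.bed. The helper names are hypothetical. The second function uses plain exact string matching rather than BLAST, so it only finds perfect copies of the ROI on the forward strand.

```python
def bed_line(chrom, start_1based, end_1based, name="ROI"):
    """Build one BED line from 1-based inclusive coordinates.

    BED uses a 0-based start and an end that is excluded from the
    interval, so only the start is shifted by one.
    """
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    return f"{chrom}\t{start_1based - 1}\t{end_1based}\t{name}"


def find_exact_matches(genome_seq, roi_seq):
    """Return BED-style (start, end) pairs for every exact copy of roi_seq.

    Overlapping copies are reported too. This is simple string search,
    not an alignment: mismatches and the reverse strand are ignored.
    """
    if not roi_seq:
        raise ValueError("roi_seq must not be empty")
    genome_seq, roi_seq = genome_seq.upper(), roi_seq.upper()
    hits = []
    pos = genome_seq.find(roi_seq)
    while pos != -1:
        hits.append((pos, pos + len(roi_seq)))
        pos = genome_seq.find(roi_seq, pos + 1)
    return hits


if __name__ == "__main__":
    # "gene" is a placeholder for the sequence name in gene.fa (assumption).
    with open("roi.bed", "w") as handle:
        handle.write(bed_line("gene", 5100, 5600) + "\n")
```

The resulting file can then be passed to bedtools for each of the 18 files, for example `bedtools intersect -a sample1.bed -b roi.bed`, where sample1.bed stands in for one of your own files. For a trimeric reference, `find_exact_matches` would be expected to report three intervals, and you can keep the one that is your true ROI or write all three as separate entries.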