perl bioinformatics genome

Perl sort genomic positions


I have a list of genomic positions in the format chromosome:start-end, for example:

chr1:100-110
chr1:1000-1100
chr1:200-300
chr10:100-200
chr2:100-200
chrX:100-200

I want to sort this by chromosome number and numerical start position to get this:

chr1:100-110
chr1:200-300
chr1:1000-1100
chr2:100-200
chr10:100-200
chrX:100-200

What is a good and efficient way to do this in Perl?


Solution

  • Just use the module Sort::Key::Natural:

    use strict;
    use warnings;
    
    use Sort::Key::Natural qw(natsort);
    
    print natsort <DATA>;
    
    __DATA__
    chr1:100-110
    chr1:1000-1100
    chr1:200-300
    chr10:100-200
    chr2:100-200
    chrX:100-200
    chrY:100-200
    chrX:1-100
    chr10:100-150
    

    Outputs:

    chr1:100-110
    chr1:200-300
    chr1:1000-1100
    chr2:100-200
    chr10:100-150
    chr10:100-200
    chrX:1-100
    chrX:100-200
    chrY:100-200
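
If installing a CPAN module is not an option, the same ordering can probably be had with core Perl alone. The following is only a minimal sketch, not part of the answer above: it parses each line once (a Schwartzian transform) and compares the pieces field by field. The helper names `chrom_key` and `sort_positions` are hypothetical, and the sketch assumes every entry has exactly the form chromosome:start-end, with numbered chromosomes sorted before named ones such as X and Y.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a chromosome name into (rank, number, name).
# Assumption: numbered chromosomes come first, in numeric order; all
# other names (X, Y, M, ...) follow in plain string order.
sub chrom_key {
    my ($chrom) = @_;
    (my $name = $chrom) =~ s/^chr//i;    # drop an optional "chr" prefix
    return $name =~ /^\d+$/ ? (0, $name, '') : (1, 0, $name);
}

# Hypothetical helper: sort a list of "chrom:start-end" strings.
sub sort_positions {
    return map  { $_->[0] }              # recover the original string
           sort {
                  $a->[1] <=> $b->[1]    # numbered before named
               or $a->[2] <=> $b->[2]    # chromosome number
               or $a->[3] cmp $b->[3]    # chromosome name
               or $a->[4] <=> $b->[4]    # start position
               or $a->[5] <=> $b->[5]    # end position
           }
           map  {
               my ($chrom, $start, $end) = /^([^:]+):(\d+)-(\d+)$/
                   or die "Unparseable position: $_\n";
               [ $_, chrom_key($chrom), $start, $end ]
           } @_;
}

my @positions = qw(
    chr1:100-110 chr1:1000-1100 chr1:200-300
    chr10:100-200 chr2:100-200 chrX:100-200
);
print "$_\n" for sort_positions(@positions);
```

For the six positions from the question this prints them in the requested order (chr1:100-110, chr1:200-300, chr1:1000-1100, chr2:100-200, chr10:100-200, chrX:100-200). Because each line is parsed only once, the cost of the regular expression stays linear in the number of lines, which may matter for large files.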