Sorting hash keys by Alphanumeric sort


I have just read the post "Sorting alphanumeric hash keys in Perl?", but I am just starting with Perl and I don't understand it very clearly.

So I have a hash like this one:

    my %hash = (
        "chr1"  => 1,
        "chr2"  => 3,
        "chr19" => 14,
        "chr22" => 1,
        "X"     => 2,
    );

I'm trying to obtain output like this:

chr1
chr2
chr19
chr22

But I'm obtaining output like this:

chr1
chr19
chr2
chr22

I have written this code, but it produces the wrong output shown above:

foreach my $chr (sort {$a cmp $b} keys(%hash)) {
    my $total= $hash{$chr};
    my $differentpercent= ($differenthash{$chr} / $total)*100;
    my $round=(int($differentpercent*1000))/1000;
    print "$chr\t$hash{$chr}\t$differenthash{$chr}\t$round\n";
}

It prints:

chr1    342421    7449    2.175
chr10    227648    5327    2.34
chr11    220415    4468    2.027
chr12    213263    4578    2.146
chr13    172379    3518    2.04
chr14    143534    2883    2.008
chr15    126441    2588    2.046
chr16    138239    3596    2.601
chr17    122137    3232    2.646
chr18    130275    3252    2.496
chr19    99876    2836    2.839
chr2    366815    8123    2.214

How can I fix this?


Solution

  • Update: note @Miller's comment below on some shortcomings of the Sort::Naturally module.

    What you are asking for is a relatively complicated sort that splits each string into alphabetical and numeric fields, and then sorts the letters lexically and the numbers by value.
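    To make the problem concrete, here is a minimal sketch using three of the keys from the question. It shows the order that a plain `cmp` sort gives, and how each key can be split into a letter prefix and a number. The variable names (`@lexical`, `%fields`) are only illustrative.

```perl
use strict;
use warnings;

my @keys = qw(chr2 chr19 chr1);

# cmp compares character by character, so "chr19" comes before "chr2"
# because "1" sorts before "2" at the fourth character.
my @lexical = sort { $a cmp $b } @keys;
print "lexical order: @lexical\n";    # chr1 chr19 chr2

# Split each key into a run of letters and an optional run of digits.
my %fields;
for my $key (@keys) {
    my ($alpha, $num) = $key =~ /^([A-Za-z]+)(\d*)/;
    $fields{$key} = [ $alpha, $num ];
    print "$key => prefix '$alpha', number '$num'\n";
}
```

    Once the number is available as a separate field it can be compared with `<=>`, which orders 2 before 19.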

    The module Sort::Naturally will do what you ask, or you can write something like this. Your expected output omits the X key, so I have sorted keys like it to the end by comparing the letters case-insensitively.

    use strict;
    use warnings;
    
    my %hash = map { $_ => 1 } qw(
        chr22  chr20  chr19  chr13  chr21  chr16  chr12  chr10  chr18
        chr17  chrY   chr5   chrX   chr8   chr14  chr6   chr3   chr9
        chr1   chrM   chr11  chr2   chr7   chr4   chr15
    );
    
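    my @sorted_keys = sort {
        # Split each key into a leading run of letters and an optional number
        my @aa = $a =~ /^([A-Za-z]+)(\d*)/;
        my @bb = $b =~ /^([A-Za-z]+)(\d*)/;
        # Compare the letters case-insensitively, then the numbers by value;
        # "|| 0" avoids a warning when a key has no digits
        lc $aa[0] cmp lc $bb[0] or ($aa[1] || 0) <=> ($bb[1] || 0);
    } keys %hash;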
    
    print "$_\n" for @sorted_keys;
    

    Output:

    chr1
    chr2
    chr3
    chr4
    chr5
    chr6
    chr7
    chr8
    chr9
    chr10
    chr11
    chr12
    chr13
    chr14
    chr15
    chr16
    chr17
    chr18
    chr19
    chr20
    chr21
    chr22
    chrM
    chrX
    chrY
    

    Using the Sort::Naturally module (you will probably have to install it) you could write this instead.

    use strict;
    use warnings;
    
    use Sort::Naturally;
    
    my %hash = map { $_ => 1 } qw(
        chr22  chr20  chr19  chr13  chr21  chr16  chr12  chr10  chr18
        chr17  chrY   chr5   chrX   chr8   chr14  chr6   chr3   chr9
        chr1   chrM   chr11  chr2   chr7   chr4   chr15
    );
    
    my @sorted_keys = nsort keys %hash;
    
    print "$_\n" for @sorted_keys;
    

    The output is identical to the above.
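    To tie this back to the loop in the question, here is a minimal sketch that moves the hand-written comparison into a named sub and uses it in place of `{$a cmp $b}`. The sub name `by_chr` is hypothetical, the `|| 0` fallback for keys without digits is an added assumption, and the sample counts are four rows copied from the output in the question.

```perl
use strict;
use warnings;

# Four rows of sample data taken from the question's printed output
my %hash          = (chr1 => 342421, chr2 => 366815, chr10 => 227648, chr19 => 99876);
my %differenthash = (chr1 => 7449,   chr2 => 8123,   chr10 => 5327,   chr19 => 2836);

# Hypothetical helper: letters compared case-insensitively, numbers by value.
# The last expression evaluated is the value handed back to sort.
sub by_chr {
    my @aa = $a =~ /^([A-Za-z]+)(\d*)/;
    my @bb = $b =~ /^([A-Za-z]+)(\d*)/;
    lc $aa[0] cmp lc $bb[0] or ($aa[1] || 0) <=> ($bb[1] || 0);
}

my @order = sort by_chr keys %hash;

for my $chr (@order) {
    my $total            = $hash{$chr};
    my $differentpercent = ($differenthash{$chr} / $total) * 100;
    my $round            = int($differentpercent * 1000) / 1000;
    print "$chr\t$total\t$differenthash{$chr}\t$round\n";
}
```

    With this data the rows come out in the order chr1, chr2, chr10, chr19. The rest of the loop body is unchanged from the question.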