How to sort an array of ASCII hex strings numerically?


I feel like this is a simple problem, but I am stuck, and to be honest my mind is overloaded at the moment. I am also new to Perl and still exploring it.

So I have a variable that contains a string:

$strings =
"3A
1B
2A
5A
4B";

then I converted it into an array by splitting it:

@split_array = split("\n", $strings);

print join("\n", @split_array) . "\n";

Output:

3A
1B
2A
5A
4B

Now, I want to sort this array in ascending order.

Here's what I tried:

@sorted_array = sort @split_array;

Expected Output:

1B
2A
3A
4B
5A

but the output is still the same.

I apologize if I missed something obvious here.

Any help would be appreciated.


Solution

  • This works fine for me:

    use strict;
    use warnings;
    
    my $strings = "3A 
    1B 
    2A 
    5A 
    4B";
    
    my @split_array = split(" ", $strings);
    
    print "@split_array\n";
    
    my @sorted_array = sort @split_array;
    print "@sorted_array\n";
    

    output is

    3A 1B 2A 5A 4B
    1B 2A 3A 4B 5A
    

    Note that this is a string sort, not a numeric sort. It works in this instance because every value is a two-character uppercase hex string, and hex strings of equal length and the same letter case sort in the same order as their numeric values. If the format differs, say some values are three hex characters long, the string sort no longer gives numeric order.

    For example

    use strict;
    use warnings;
    
    my $strings = "3A 
    100 
    1B 
    2A 
    5A 
    4B";
    
    my @split_array = split(" ", $strings);
    
    print "@split_array\n";
    
    my @sorted_array = sort @split_array;
    print "@sorted_array\n";
    

    outputs

    3A 100 1B 2A 5A 4B
    100 1B 2A 3A 4B 5A
    

    To fix that, you need a proper numeric sort, which means converting each hex string to its numeric value before comparing.

    use strict;
    use warnings;
    
    my $strings = "3A 
    100 
    1B 
    2A 
    5A 
    4B";
    
    my @split_array = split(" ", $strings);
    
    print "@split_array\n";
    
    my @sorted_array = sort @split_array;
    print "@sorted_array\n";
    
    my @numeric_sorted_array = sort { hex $a <=> hex $b } @split_array;
    print "@numeric_sorted_array\n";
    

    output is

    3A 100 1B 2A 5A 4B
    100 1B 2A 3A 4B 5A
    1B 2A 3A 4B 5A 100
    

    The key here is the sort { hex $a <=> hex $b } expression: the hex function converts each hex string to its numeric value, and the <=> operator then compares those values numerically.
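
    For longer lists, the same comparison can be arranged so that hex runs once per element rather than once per comparison. The following is only a sketch of that idea (a Schwartzian transform); the helper name sort_hex_strings is hypothetical and not part of the original answer.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: sort hex strings by numeric value,
    # calling hex() once per element instead of once per comparison.
    sub sort_hex_strings {
        return map  { $_->[1] }               # recover the original string
               sort { $a->[0] <=> $b->[0] }   # compare the decoded values
               map  { [ hex($_), $_ ] }       # pair each value with its string
               @_;
    }

    my @sorted = sort_hex_strings(qw(3A 100 1b 2A 5A 4B));
    print "@sorted\n";    # prints: 1b 2A 3A 4B 5A 100
    ```

    Because hex accepts both upper- and lower-case digits, mixed-case input such as 1b is ordered by value here, which a plain string sort would not do.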

    [EDIT]

    If the input data contains non-hex characters, clean it up before calling the hex function; otherwise, with warnings enabled, Perl prints the Illegal hexadecimal digit warning and ignores the offending character, as shown below:

    $ perl -e 'use strict; use warnings;  print hex("10Y") . "\n"'
    Illegal hexadecimal digit 'Y' ignored at -e line 1.
    16
    

    If the requirement is a sorted list of hex strings with all non-hex characters removed, the code can be updated like this:

    use strict;
    use warnings;
    
    my $strings = "3A 
    100 
    1BY
    2A 
    5A 
    4B";
    
    my @split_array = split(" ", $strings);
    
    print "Unsorted: @split_array\n";
    
    my @sorted_array = sort @split_array;
    print "Default sort: @sorted_array\n";
    
    my @numeric_sorted_array = sort { hex $a <=> hex $b } @split_array;
    print "Numeric Sort: @numeric_sorted_array\n";
    
    my @cleaned_numeric_sorted_array = sort { hex $a <=> hex $b } 
                                       map  { s/[^0-9a-f]+//gir } 
                                       @split_array;
    print "Cleaned: @cleaned_numeric_sorted_array\n";
    

    output is

    Unsorted: 3A 100 1BY 2A 5A 4B
    Default sort: 100 1BY 2A 3A 4B 5A
    Illegal hexadecimal digit 'Y' ignored at /tmp/fred.pl line 19.
    Illegal hexadecimal digit 'Y' ignored at /tmp/fred.pl line 19.
    Illegal hexadecimal digit 'Y' ignored at /tmp/fred.pl line 19.
    Numeric Sort: 1BY 2A 3A 4B 5A 100
    Cleaned: 1B 2A 3A 4B 5A 100
    

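    The magic happens in the map step: it strips the non-hex characters from each element before feeding the data into sort. The /r flag makes the substitution return a cleaned copy and leave @split_array untouched; it requires Perl 5.14 or later.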
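
    Stripping characters assumes that 1BY really means 1B. If it might be a typo instead, an alternative is to sort only the well-formed entries and report the rest. This is a minimal sketch under that assumption; the variable names are illustrative only.

    ```perl
    use strict;
    use warnings;

    my @input = qw(3A 100 1BY 2A 5A 4B);

    # Split the input into well-formed hex strings and everything else.
    my @valid   = grep {  /\A[0-9A-Fa-f]+\z/ } @input;
    my @invalid = grep { !/\A[0-9A-Fa-f]+\z/ } @input;

    # Only validated strings reach hex(), so no warning is triggered.
    my @sorted = sort { hex $a <=> hex $b } @valid;

    print "Sorted: @sorted\n";       # prints: Sorted: 2A 3A 4B 5A 100
    print "Rejected: @invalid\n";    # prints: Rejected: 1BY
    ```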