
unpacking a data structure whose first byte indicates length


I am trying to unpack a TLE (Tagged Logical Element) from an IBM AFP format file.

The specification (http://www.afpcinc.org/wp-content/uploads/2017/12/MODCA-Reference-09.pdf) indicates that these are two triplets (even though there are four values) that are structured as follows (with their byte offsets):

0: Tlength | 1: Tid | 2-n: Parameter (= 2: Type + 3: Format + 4-n: EBCDIC encoded String)

Example (with two triplets, one indicating the name and one the value):

0C 02  0B  00   C3 A4 99 99 85 95 83 A8    07 36  00 00    C5 E4 D9
12 KEY UID CHAR  C  u  r  r  e  n  c  y     7 VAL RESERVED  E  U  R

I currently parse it in Perl as follows, and this works:

    if ($key eq 'Data') {
        my $tle = $member->{struct}->{$key};
        my $k_length = hex(unpack('H2', substr($tle, 0, 1)));
        my $key = decode('cp500', substr($tle, 4, $k_length - 4));
        my $v_length = hex(unpack('H2', substr($tle, $k_length, 1)));
        my $value = decode('cp500', substr($tle, $k_length + 4, $v_length - 4));
        print("'$key' => '$value'\n");
    }

Result:

'Currency' => 'EUR'

While the above works, I feel that my approach is more complicated than it needs to be and that there is a more efficient way to do this. For example, do pack templates support reading the first n bytes and using that value as the count of how many following bytes to unpack? I read the Perl pack tutorial but couldn't find anything along those lines.


Solution

  • If the length field didn't count itself, the C/a template (a one-byte count followed by that many bytes) would do the job:

    (my $record, $unparsed) = unpack("C/a a*", $unparsed);
    my $key = decode("cp500", unpack("x3 a*", $record));
    

    But here the length field counts itself, so read the length byte first and then take the remaining $length - 1 bytes:

    (my $length, $unparsed) = unpack("C a*", $unparsed);
    (my $record, $unparsed) = unpack("a".($length-1)." a*", $unparsed);
    my $key = decode("cp500", unpack("x3 a*", $record));
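    Putting the second approach into a loop gives something like the sketch below, which walks every triplet in the example data from the question. The name parse_triplets and the id/param hash keys are made up for illustration, and the code assumes (as in the example) that each triplet has a one-byte Tid and that two parameter bytes precede the EBCDIC string.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Split a buffer into triplets. Each triplet starts with a length byte
# that counts itself, followed by a one-byte Tid and the parameter bytes.
# (parse_triplets is a hypothetical helper, not part of any module.)
sub parse_triplets {
    my ($unparsed) = @_;
    my @triplets;
    while (length $unparsed) {
        my $length = unpack('C', $unparsed);
        die "bad triplet length $length\n"
            if $length < 2 || $length > length($unparsed);
        # Skip the length byte, take the remaining $length - 1 bytes,
        # and keep whatever follows for the next iteration.
        (my $record, $unparsed) = unpack('x a' . ($length - 1) . ' a*', $unparsed);
        my ($tid, $param) = unpack('C a*', $record);
        push @triplets, { id => $tid, param => $param };
    }
    return @triplets;
}

# The 19 example bytes from the question.
my $tle = pack('H*', '0C020B00C3A49999859583A807360000C5E4D9');

# Assumption: the first two parameter bytes are not part of the string.
my ($name, $value) =
    map { decode('cp500', substr($_->{param}, 2)) } parse_triplets($tle);

print "'$name' => '$value'\n";
```

    With the example bytes this prints 'Currency' => 'EUR', matching the result in the question. The length check is there because a length byte below 2 or beyond the end of the buffer would otherwise yield a malformed template or a silently truncated record.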