
Unpack format characters in Python


I need the Python equivalent of this Perl call:

unpack("nNccH*", string_val)

Specifically, I need to express the nNccH* data format using Python's format characters.

In Perl, it unpacks binary data into five variables:

  • Unsigned 16-bit value in "network" (big-endian) byte order
  • Unsigned 32-bit value in "network" (big-endian) byte order
  • Signed char (8-bit integer) value
  • Signed char (8-bit integer) value
  • Hexadecimal string, high nibble first

But I can't work out how to do this in Python.

More context:

bstring = ''
while DataByte = client[0].recv(1):
    bstring += DataByte
print len(bstring)
if len(bstring):
    a, b, c, d, e = unpack("nNccH*", bstring)

I have never written Perl or Python before, but my current task is to port a multithreaded server, originally written in Perl, to Python...


Solution

  • The Perl format "nNcc" is equivalent to the Python format "!HLbb". There is no direct equivalent in Python for Perl's "H*".

    There are two problems.

    • Python's struct.unpack does not accept the wildcard character, *
    • Python's struct.unpack does not "hexlify" data strings
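Both limitations are easy to confirm interactively. The following is a minimal sketch, assuming Python 3; the `data` bytes are hand-written here to match what Perl's pack("nNccH*", 1, 2, 10, 4, '1fba') would produce, and are not taken from a real capture.

```python
import struct

# Assumed sample: the bytes Perl's pack("nNccH*", 1, 2, 10, 4, '1fba') produces.
data = b"\x00\x01\x00\x00\x00\x02\x0a\x04\x1f\xba"

# The fixed-size prefix maps directly: n -> H, N -> L, c -> b,
# with "!" selecting network (big-endian) byte order.
header = struct.unpack("!HLbb", data[:8])
print(header)

# Problem 1: "*" is not a valid struct format character.
try:
    struct.unpack("!HLbbs*", data)
except struct.error as exc:
    print("struct.error:", exc)

# Problem 2: with an explicit count, "s" returns the raw bytes,
# not a hex string.
tail = struct.unpack("!HLbb2s", data)[-1]
print(tail)
```

The explicit count in "2s" only works here because the length of the tail is known in advance, which is exactly what Perl's "*" avoids.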

    The first problem can be worked around with a helper function such as the unpack defined below.

    The second problem can be solved using binascii.hexlify:

    import struct
    import binascii
    
    def unpack(fmt, data):
        """
        Return struct.unpack(fmt, data) with the optional single * in fmt replaced with
        the appropriate number, given the length of data.
        """
        # http://stackoverflow.com/a/7867892/190597
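        try:
            return struct.unpack(fmt, data)
        except struct.error:
            idx = fmt.find('*')
            if idx < 1:
                raise  # no wildcard to expand, so the error is genuine
            flen = struct.calcsize(fmt.replace('*', ''))
            alen = len(data)
            before_char = fmt[idx-1]
            # Keep the byte-order prefix so the item size matches the rest of fmt
            prefix = fmt[0] if fmt[0] in '@=<>!' else ''
            n = (alen-flen)//struct.calcsize(prefix + before_char)+1
            fmt = ''.join((fmt[:idx-1], str(n), before_char, fmt[idx+1:]))
            return struct.unpack(fmt, data)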
    
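    # Binary mode matters: in Python 3, text mode would return str, not bytes
    with open('data', 'rb') as f:
        data = f.read()
    x = list(unpack("!HLbbs*", data))
    # x[-1].encode('hex') works in Python 2, but not in Python 3
    # hexlify returns bytes in Python 3; decode to get a plain string
    x[-1] = binascii.hexlify(x[-1]).decode('ascii')
    print(x)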
    

    When tested on data produced by this Perl script:

    $line = pack("nNccH*", 1, 2, 10, 4, '1fba');
    print "$line";
    

    The Python script yields

    [1, 2, 10, 4, '1fba']
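
If the variable-length hex field is always the last one, as it is in "nNccH*", the wildcard helper can be avoided by unpacking the fixed 8-byte header and hexlifying whatever follows. This is a sketch of that alternative, assuming Python 3; `unpack_record` and `HEADER` are hypothetical names introduced for illustration, and the sample bytes are built in Python rather than read from the Perl-generated file.

```python
import binascii
import struct

# n, N, c, c in network byte order: 2 + 4 + 1 + 1 = 8 bytes.
HEADER = struct.Struct("!HLbb")

def unpack_record(data):
    """Hypothetical helper mimicking Perl's unpack("nNccH*", data)."""
    if len(data) < HEADER.size:
        raise ValueError(
            "need at least %d bytes, got %d" % (HEADER.size, len(data))
        )
    a, b, c, d = HEADER.unpack_from(data)
    # Everything after the fixed header plays the role of H*.
    e = binascii.hexlify(data[HEADER.size:]).decode("ascii")
    return a, b, c, d, e

# Build the same bytes as the Perl test script, without needing Perl.
sample = struct.pack("!HLbb", 1, 2, 10, 4) + binascii.unhexlify("1fba")
print(unpack_record(sample))  # (1, 2, 10, 4, '1fba')
```

Precompiling the format with struct.Struct also avoids re-parsing the format string for every message, which may matter in a server that handles many records.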