Tags: python, struct, encoding, utf-16, binascii

Pack into C types and obtain the binary value back


I'm using the following code to pack an integer into an unsigned short:

import struct

raw_data = 40

# Pack into little endian
data_packed = struct.pack('<H', raw_data)

Now I'm trying to unpack the result as follows. I use utf-16-le since the data is encoded as little-endian.

import binascii

def get_bin_str(data):
    bin_asc = binascii.hexlify(data)
    result = bin(int(bin_asc.decode("utf-16-le"), 16))
    trimmed_res = result[2:]
    return trimmed_res

print(get_bin_str(data_packed))

Unfortunately, it throws the following error,

    result = bin(int(bin_asc.decode("utf-16-le"), 16))
ValueError: invalid literal for int() with base 16: '㠲〰'

How do I properly decode the little-endian bytes to get the binary value back?


Solution

  • Use struct.unpack to reverse what you packed. The packed bytes are a raw integer, not UTF-encoded text, so there is no reason to use a UTF codec.

    >>> import struct
    >>> data_packed = struct.pack('<H', 40)
    >>> data_packed.hex()   # the two little-endian bytes are 0x28 (40) and 0x00 (0)
    '2800'
    >>> data = struct.unpack('<H',data_packed)
    >>> data
    (40,)
    

    struct.unpack always returns a tuple, so index it to get the single value:

    >>> data = struct.unpack('<H',data_packed)[0]
    >>> data
    40
    

    To print in binary, use string formatting. Either of these works well. bin() doesn't let you specify the number of binary digits to display, and its 0b prefix has to be stripped if you don't want it.

    >>> format(data,'016b')
    '0000000000101000'
    >>> f'{data:016b}'
    '0000000000101000'
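
Putting the pieces together, the helper from the question could be rewritten roughly as below. This is only a sketch: the `width` parameter is an addition for illustration and is not in the original code, and the helper assumes the input is exactly one little-endian unsigned short. The second half shows why the hexlify route went wrong: the hex digits are plain ASCII, and they appear in byte order (low byte first), so even a correct ASCII decode would give 0x2800 rather than 40.

```python
import binascii
import struct


def get_bin_str(data: bytes, width: int = 16) -> str:
    """Return the unsigned short packed in `data` as a zero-padded binary string.

    `width` is a hypothetical convenience parameter, not part of the original helper.
    """
    # '<H' = little-endian unsigned short; unpack returns a 1-tuple, so unpack it.
    (value,) = struct.unpack('<H', data)
    return format(value, f'0{width}b')


data_packed = struct.pack('<H', 40)
print(get_bin_str(data_packed))                 # 0000000000101000

# hexlify returns ASCII hex digits, not UTF-16 text.
hex_digits = binascii.hexlify(data_packed)
print(hex_digits)                               # b'2800'

# Decoding as ASCII works, but the digits are still in little-endian byte order.
print(int(hex_digits.decode('ascii'), 16))      # 10240 (0x2800), not 40

# int.from_bytes is another way to apply the byte order explicitly.
print(int.from_bytes(data_packed, 'little'))    # 40
```

struct.unpack is the natural choice when the bytes came from struct.pack with a known format string; int.from_bytes is handy when you only have a run of bytes and a byte order.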