Search code examples
pythondoubleamf

Decoding an IEEE double precision float (8 byte)


I'm decoding an AMF0 format file. The data I'm looking for is a timestamp, encoded as an array. [HH, MM, SS].

Since the data is AMF0, I can locate the start of the data by reading in the file as bytes, converting each byte to hex, and looking for the signal 08 00 00 00 03, an array of length 3.

My problem is that I don't know how to decode the 8-byte integer in each element of the array. I have the data in the same, hex-encoded format, e.g.:

08 00 00 00 03 *signals array length 3*
00 01 30 00 00 00 00 00 00 00 00 00 *signals integer*
00 01 31 00 00 00 00 00 00 00 00 00 *signals integer*
00 01 32 00 40 3C 00 00 00 00 00 00 *signals integer*
00 00 09 *signals object end*

This should be decoded as [0, 0, 28] (if minerva is to be believed).
I've been trying to use struct.unpack, but all the examples I see are for 4-byte (little endian) values.


Solution

  • The format specifier you are looking for is ">9xd4xd4xd3x":

    >>> import struct
    >>> from binascii import unhexlify
    >>> struct.unpack(">9xd4xd4xd3x", unhexlify("080000000300013000000000000000000000013100000000000000000000013200403C000000000000000009"))
    (0.0, 0.0, 28.0)
    

    Broken down:

    1. >: big endian format
    2. 5x: 5 bytes begin-of-array marker + size (ignored)
    3. 4x: 4 bytes begin-of-element marker (ignored)
    4. d: 1 big endian IEEE-754 double
    5. points 2-3 for other 2 elements
    6. 3x: 3 bytes end-of-array marker (ignored)

    Points 1. and 2. are merged together into 9x.

    As you might have noticed, struct can only ignore extra bytes, not validate. If you need more flexibility in the input format, you could use a regex matching begin/end array markers in non-greedy mode.