python struct unpack

Unpacking 4-byte unsigned integers from binary file gives struct.error: unpack requires a buffer of 4 bytes


This may be a duplicate question, but I couldn't find an answer.

I am reading a binary file in the following format (showing just one of the 3000 lines, in hex, as displayed in Sublime Text):

0009 7f71 0009 b87b 0009 f24b 000a 2ce2

I want to read it as a tuple of 4-byte unsigned integers:

if filename:
    with open(filename, mode='rb') as file:
        fileData = file.read()
        unsignedData = struct.unpack('I', fileData )

However, I get the following error on the last line of the code above:

struct.error: unpack requires a buffer of 4 bytes

How do I fix this?


Solution

  • Decoding a few integers

    Called without a size argument, file.read() returns everything up to the end of the file (for a file opened with mode='rb' this is io.BufferedReader.read), which is almost always far more than 4 bytes.

    On the other hand, struct.unpack requires the buffer's size in bytes to match exactly the size implied by the format string (struct.calcsize reports that size).
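To make the size rule concrete, here is a minimal sketch. The `buffer` of zero bytes is only a stand-in for the file contents, not the real data from the question.

```python
import struct

# Stand-in for the file contents: exactly enough zero bytes for four ints.
buffer = bytes(struct.calcsize("4I"))

print(struct.calcsize("I"))   # bytes required by the format 'I'
print(struct.calcsize("4I"))  # bytes required by the format '4I'

try:
    # The buffer is longer than 'I' requires, so this raises struct.error.
    struct.unpack("I", buffer)
except struct.error as exc:
    error_message = str(exc)
    print(error_message)

# Here the buffer length equals calcsize('4I'), so unpacking succeeds.
values = struct.unpack("4I", buffer)
print(values)
```

The same check can be done up front by comparing len(buffer) with struct.calcsize(fmt) before calling struct.unpack.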

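    The overall structure of your file is not clear, but, for example, to read 4 unsigned 32-bit integers in the machine's native byte order (little-endian on most hardware), slice the buffer to exactly 16 bytes. If the file is big-endian, as the sample bytes you posted suggest, use '>4I' instead of '4I':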

    unsignedData = struct.unpack('4I', fileData[:16])
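Byte order is worth checking alongside the slice. The sketch below decodes the 16 bytes from the question's hex dump both ways. It assumes the dump shows the raw file bytes in order; whether the real file is big-endian is a guess based on the big-endian reading giving small, steadily increasing values.

```python
import struct

# The 16 bytes shown in the question's hex dump (assumed to be raw file bytes).
sample = bytes.fromhex("00097f71 0009b87b 0009f24b 000a2ce2")

# '>' forces big-endian and '<' forces little-endian, both with standard sizes.
big = struct.unpack(">4I", sample[:16])
little = struct.unpack("<4I", sample[:16])

print(big)     # (622449, 637051, 651851, 666850)
print(little)  # much larger, irregular values
```

A bare 'I' follows the native byte order of the machine running the code, so an explicit '<' or '>' prefix keeps the result the same on every platform.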
    

    Decoding a stream

    In case you need to decode an arbitrarily long stream of integers from your file, there are a few options, depending on the expected data length.

    Short stream

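    import struct

    with open(filename, mode='rb') as fp:
        fileData = fp.read()
    # Number of whole ints in the file, plus any leftover bytes
    n, r = divmod(len(fileData), struct.calcsize('I'))
    assert r == 0, "Data length not a multiple of int size"
    # A repeat count (e.g. '3000I') avoids building a very long format string
    unsignedData = struct.unpack(f'{n}I', fileData)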
    

    Long stream

    You could use struct.iter_unpack, but for large integer data it probably makes more sense to keep it in an array.array or a numpy.ndarray anyway.
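For reference, a rough sketch of the struct.iter_unpack route, reading the file in chunks so the whole file is never held in memory at once. The helper name `read_uint32s`, the `chunk_items` parameter, and the temporary file are all made up for this illustration.

```python
import os
import struct
import tempfile

# Write a small throwaway file so the sketch runs on its own.
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(struct.pack("6I", 1, 2, 3, 4, 5, 6))
    filename = tmp.name

def read_uint32s(path, chunk_items=4):
    """Yield unsigned ints from path, reading chunk_items values at a time."""
    item_size = struct.calcsize("I")
    with open(path, "rb") as fp:
        while True:
            chunk = fp.read(item_size * chunk_items)
            if not chunk:
                break
            # iter_unpack raises struct.error if the chunk length is not
            # a multiple of item_size (for example, a truncated file).
            for (value,) in struct.iter_unpack("I", chunk):
                yield value

values = list(read_uint32s(filename))
print(values)  # [1, 2, 3, 4, 5, 6]

os.remove(filename)
```

Each item produced by iter_unpack is a tuple, even for a single-field format, hence the `(value,)` unpacking.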

    Here's an example for a NumPy array:

    import numpy
    data = numpy.fromfile(filename, dtype='uint32')
    

    I'll leave loading the data into Python's built-in homogeneous array as an exercise for the reader (hint: array.array.fromfile).
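For readers who want to check their attempt at the exercise, one possible sketch follows. It assumes the file holds native-byte-order unsigned ints and nothing else; the temporary file exists only so the example is self-contained.

```python
import array
import os
import struct
import tempfile

# Write a small throwaway file so the sketch runs on its own.
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(struct.pack("4I", 10, 20, 30, 40))
    filename = tmp.name

data = array.array("I")  # typecode 'I': C unsigned int, native byte order
n_items = os.path.getsize(filename) // data.itemsize

with open(filename, "rb") as fp:
    # fromfile raises EOFError if fewer than n_items items are available.
    data.fromfile(fp, n_items)

# If the file's byte order differs from the machine's, data.byteswap()
# would be needed at this point.

print(data.tolist())  # [10, 20, 30, 40]

os.remove(filename)
```

Unlike a tuple from struct.unpack, the array stores the values compactly in a single typed buffer.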