
Reading a binary file into a struct


I have a binary file with a known format/structure.

How do I read all of the binary data into an array of that structure?

Something like this (in pseudocode):

bytes = read_file(filename)
struct = {'int','int','float','byte[255]'}
data = read_as_struct(bytes, struct)
data[1]
>>> 10,11,10.1,Arr[255]

My solution so far is:

data = []

fmt   = '=iiiii256i'
fmt_s = '=iiiii'
fmt_spec = '256i'

struct_size = struct.calcsize(fmt)

for i in range(struct_size, len(bytes)-struct_size, struct_size):
    dat1= list(struct.unpack(fmt_s, bytes[i-struct_size:i-1024]))
    dat2= list(struct.unpack(fmt_spec, bytes[i-1024:i]))
    dat1.append(dat2)
    data.append(dat1)

Solution

  • Use the struct module; you describe the record layout with a format string, as documented for that module:

    struct.unpack('=HHf255s', bytes)
    

    The above example expects native byte order (with standard sizes and no alignment padding), two unsigned shorts, a float, and a 255-byte string.

    To loop over a byte string that has already been read in full, I'd use itertools; there is a handy grouper recipe that I've adapted here:

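    # Python 2: izip_longest and imap were removed from itertools in Python 3.
    from itertools import izip_longest, imap
    from struct import unpack, calcsize

    fmt_s = '=5i'        # five leading ints
    fmt_spec = '=256i'   # trailing array of 256 ints
    size_s = calcsize(fmt_s)
    size = size_s + calcsize(fmt_spec)   # bytes per record

    def chunked(iterable, n, fillvalue=''):
        # Group the input into consecutive n-character strings. A short final
        # chunk is not padded (fillvalue is ''), so unpack will reject it.
        args = [iter(iterable)] * n
        return imap(''.join, izip_longest(*args, fillvalue=fillvalue))

    # One tuple per record: the five ints followed by a nested tuple of 256 ints.
    data = [unpack(fmt_s, section[:size_s]) + (unpack(fmt_spec, section[size_s:]),)
        for section in chunked(bytes, size)]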

    This produces tuples rather than lists, but it's easy enough to adjust if you have to:

    data = [list(unpack(fmt_s, section[:size_s])) + [list(unpack(fmt_spec, section[size_s:]))]
        for section in chunked(bytes, size)]
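
On Python 3, izip_longest and imap are no longer available and iterating over a bytes object yields integers, so the chunking helper above would need changes. One possible alternative, offered as a minimal sketch and not as the answer's original method, is struct.Struct.iter_unpack (Python 3.4+), which walks a buffer record by record. The name read_records and the decision to silently ignore an incomplete trailing record are assumptions made for illustration; the layout (5 ints followed by 256 ints) is taken from the question's format string.

```python
import struct

# Layout from the question: 5 ints followed by an array of 256 ints,
# native byte order, standard sizes, no padding.
RECORD = struct.Struct('=5i256i')
HEADER_FIELDS = 5


def read_records(raw):
    """Return one tuple per record: 5 ints plus a nested tuple of 256 ints.

    Assumption for this sketch: an incomplete trailing record is ignored.
    iter_unpack itself raises struct.error if the buffer length is not a
    multiple of the record size, so the buffer is trimmed first.
    """
    usable = len(raw) - len(raw) % RECORD.size
    return [
        fields[:HEADER_FIELDS] + (fields[HEADER_FIELDS:],)
        for fields in RECORD.iter_unpack(raw[:usable])
    ]


# Hypothetical sample data: two packed records standing in for file contents.
sample = b''.join(RECORD.pack(*range(start, start + 261)) for start in (0, 1000))
records = read_records(sample)
```

With a real file, the bytes passed to read_records would come from reading the file in binary mode, for example open(filename, 'rb').read(). If a truncated file should be treated as an error instead, dropping the trimming step lets iter_unpack raise struct.error.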