Tags: python, c, numpy, struct, python-cffi

How to handle member padding in C struct when reading cffi.buffer with numpy.frombuffer?


I need to read an array of C structs returned from a DLL and convert it to a NumPy array. The code uses Python's cffi module.

The code works so far, but I don't know how to handle the member padding in the struct, which makes np.frombuffer fail with:

ValueError: buffer size must be a multiple of element size

This is my code:

from cffi import FFI
import numpy as np

s = '''
    typedef struct
    {
        int a;
        int b;
        float c;
        double d;
    } mystruct;
    '''

ffi = FFI()
ffi.cdef(s)

res = []

# create array and fill with dummy data
for k in range(2):

    m = ffi.new("mystruct *")

    m.a = k
    m.b = k + 1
    m.c = k + 2.0
    m.d = k + 3.0

    # append inside the loop so that every struct is collected
    res.append(m[0])

m_arr = ffi.new("mystruct[]", res)

print(m_arr)

# dtype for structured array in Numpy
dt = [('a', 'i4'),
      ('b', 'i4'),
      ('c', 'f4'),
      ('d', 'f8')]

# member size, 20 bytes
print('size, manually', 4 + 4 + 4 + 8)

# total size of struct, 24 bytes
print('sizeof', ffi.sizeof(m_arr[0]))

# the 4-byte difference is member padding inside the struct

buf = ffi.buffer(m_arr)
print(buf)

x = np.frombuffer(buf, dtype=dt)
print(x)

Any ideas how to handle this in a clean way?


Edit:

It seems to work if I add an extra dummy field to the dtype at the position where the padding occurs:

dt = [('a', 'i4'),
      ('b', 'i4'),
      ('c', 'f4'),
      ('pad', 'f4'),
      ('d', 'f8')]

Why does the padding happen there? (Win7, 64-bit, Python 3.4 64-bit).

But that can't be the best way. The real code is much more complicated and dynamic, so it should be possible to handle this somehow, right?


Solution

  • Probably the most convenient way is to pass align=True to the numpy dtype constructor. NumPy then pads the fields the way a C compiler typically would, so the padding bytes no longer have to be declared by hand.

    dt = [('a', 'i4'),
          ('b', 'i4'),
          ('c', 'f4'),
          ('d', 'f8')]
    
    dt_obj = np.dtype(dt, align=True)
    x = np.frombuffer(buf, dtype=dt_obj)
    

    (See also the NumPy documentation on structured arrays.)
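
As for why the padding sits between c and d: a double normally has to start at an offset that is a multiple of 8, and a, b and c together occupy only 12 bytes, so 4 padding bytes are inserted before d. The sketch below checks this on the NumPy side only. It is an illustration under assumptions, not the original cffi code: the buffer is built with the standard struct module as a stand-in for ffi.buffer(m_arr), and the offsets shown are those of a typical 64-bit platform. The names fields, packed, aligned, raw and explicit are made up for the example.

```python
import struct
import numpy as np

fields = [('a', 'i4'), ('b', 'i4'), ('c', 'f4'), ('d', 'f8')]

packed = np.dtype(fields)               # fields back to back, no padding
aligned = np.dtype(fields, align=True)  # C-like alignment and padding

# Byte offset of every field in the aligned layout.
offsets = {name: aligned.fields[name][1] for name in aligned.names}
print(packed.itemsize, aligned.itemsize, offsets)

# Stand-in for ffi.buffer(m_arr): two structs packed with native
# alignment ('@'), which also inserts the 4 padding bytes before d.
raw = b''.join(struct.pack('@iifd', k, k + 1, k + 2.0, k + 3.0)
               for k in range(2))

x = np.frombuffer(raw, dtype=aligned)
print(x)

# If the layout is only known at run time, the offsets and the total
# size can be passed explicitly instead of relying on align=True.
explicit = np.dtype({'names': ['a', 'b', 'c', 'd'],
                     'formats': ['i4', 'i4', 'f4', 'f8'],
                     'offsets': [0, 4, 8, 16],
                     'itemsize': 24})
y = np.frombuffer(raw, dtype=explicit)
print(y['d'])
```

In the cffi case the explicit offsets and item size could presumably be taken from ffi.offsetof("mystruct", "d") and ffi.sizeof("mystruct") rather than hard-coded, which may suit code where the struct definition is generated dynamically.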