
Python struct size mismatch between Windows and Linux


I have a working service, written in Python 3.11, that communicates with PLCs using standard sockets and ctypes for packing / unpacking data. The app works on both Windows and Linux (both 64-bit) without issues.

For testing purposes, I have an additional set of scripts that simulate different conditions with the PLC and other subsystems. One of them uses Python's struct module to pack and unpack data, following the same protocols as the real PLC.

That script works fine on Windows, but on Linux the expected buffer size is very different.

This is the buffer expected to be received, 5232 bytes including padding. The @ prefix selects native sizes and alignment, so padding is inserted where required.

bufftype = f"@I29sLIL29sH{4*320}L4LhhIbbi"

(The 4 * 320 count before L is intentional: it is a matrix of 4 longs x 320 rows.)

However, on Linux (Manjaro) it fails because it expects 10384 bytes instead of 5232. If I change from @ to standard packing with =, or force little-endian with <, it expects 5226 bytes, which is the standard size of this structure without padding.
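For reference, the sizes quoted above can be reproduced with struct.calcsize. This is only a minimal sketch: the names ROWS and fields are introduced here for illustration, and the value printed for @ depends on the platform the script runs on.

```python
import struct

ROWS = 320  # rows in the matrix of 4 longs per row

# Same field layout as bufftype, without the size/alignment prefix.
fields = f"I29sLIL29sH{4 * ROWS}L4LhhIbbi"

# "@" uses native sizes and alignment; "=" and "<" use standard sizes, no padding.
for prefix in ("@", "=", "<"):
    print(prefix, struct.calcsize(prefix + fields))

# Native size of a single unsigned long on this machine.
print("native L:", struct.calcsize("@L"))
```

The = and < results are the same on every platform, while the @ result follows the native size and alignment of each field.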

I can't find an explanation other than that, on Linux, unsigned long (L) takes 8 bytes (instead of the 4 listed as its standard size in the official docs), so the 4 * 320 longs alone take 4 * 320 * 8 bytes. But then, how can I force the field to use only 4 bytes? Is there any way to set this up in the interpreter so that the behavior is fixed?


Solution

  • The issue is that Windows keeps the C long type at 4 bytes on both 32-bit and 64-bit systems, while Linux and macOS make it 8 bytes on 64-bit platforms. With the @ prefix, struct uses these native sizes.

    Therefore, if bufftype is based on a C/C++ struct and is used to communicate with a C/C++ library, the larger buffer size is correct, because that is what the C compiler for that particular platform will use.

    If we are instead talking about a platform-agnostic format, such as an over-the-wire protocol for network transmissions or a file format, you have to stick with a 4-byte type. int is 4 bytes on all major platforms, so in the format string use I (unsigned int) instead of L. Where long is also 4 bytes, as on Windows, int and long have the same bit-for-bit representation in memory, so the data is unchanged.

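As a rough sketch of the platform-agnostic route, assuming the PLC protocol really uses the 5232-byte layout from the question (4-byte integers with padding, little-endian), every L can be swapped for I. The names portable and explicit are made up for this example. The second variant is an alternative that pins the byte order with < and spells out the padding with x pad bytes, so nothing depends on the host platform.

```python
import struct

ROWS = 320  # rows in the matrix of 4 values per row

# Original format: in "@" mode, L is 4 bytes on Windows but 8 on 64-bit Linux/macOS.
original = f"@I29sLIL29sH{4 * ROWS}L4LhhIbbi"

# Assumed fix: every L replaced by I (unsigned int), native alignment kept.
portable = f"@I29sIII29sH{4 * ROWS}I4IhhIbbi"

# Alternative: standard sizes, little-endian, padding written out by hand.
explicit = f"<I29s3xIII29sxH{4 * ROWS}I4IhhIbb2xi"

print("original:", struct.calcsize(original))  # platform dependent
print("portable:", struct.calcsize(portable))  # 5232 where int is 4 bytes, 4-byte aligned
print("explicit:", struct.calcsize(explicit))  # 5232 on every platform
```

The explicit form is more verbose, but it documents the padding and byte order in the format string itself instead of relying on what the local C compiler would do.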