
Python Packing int32 but doesn't work?


I have been at this all day and I can't seem to find a solution :(

I have a header for a file I want to create (I'm parsing an obj file in Python to output into binary data to be loaded up in my C++ game engine).

Here is the C++ definition of the Mesh Header

struct MeshHeader
{
    unsigned short _vertex_size;
    uint32 _vertex_count;
    unsigned short _index_buffer_count;
    short _position_offset;
    unsigned short _position_component_count;
    short _uv_offset;
    unsigned short _uv_component_count;
    short _normal_offset;
    unsigned short _normal_component_count;
    short _tangent_offset;
    unsigned short _tangent_component_count;
    short _binormal_offset;
    unsigned short _binormal_component_count;
    short _colour_offset;
    unsigned short _colour_component_count;
};

Here uint32 is a typedef for uint32_t from stdint.h. So, judging from this, the first three member variables are 2 bytes, 4 bytes, and 2 bytes respectively, yes?

This is how I read it into the structure

fread(&_header, sizeof(MeshHeader), 1, f);

_vertex_size is set correctly to 56, but _vertex_count is set to 65536. If I change its data type to uint16 (unsigned short), it is set correctly to 36. But why? I am using pack('<I', ...), knowing my machine is little-endian.

This is my packing code in python

    f.write(pack('<H', self._vertex_size))
    f.write(pack('<I', self._vertex_count))
    f.write(pack('<H', self._index_buffer_count))
    f.write(pack('<h', self._position_offset))
    f.write(pack('<H', self._position_component_count))
    f.write(pack('<h', self._uv_offset))
    f.write(pack('<H', self._uv_component_count))
    f.write(pack('<h', self._normal_offset))
    f.write(pack('<H', self._normal_component_count))
    f.write(pack('<h', self._tangent_offset))
    f.write(pack('<H', self._tangent_component_count))
    f.write(pack('<h', self._binormal_offset))
    f.write(pack('<H', self._binormal_component_count))
    f.write(pack('<h', self._colour_offset))
    f.write(pack('<H', self._colour_component_count))

Following the specs of the struct.pack function ( https://docs.python.org/2/library/struct.html ): H is an unsigned short (2 bytes), I is an unsigned int (4 bytes), and h is a short (2 bytes), which match exactly what I have specified in my C++ MeshHeader struct, do they not?

I've been pulling out my hair the past few hours (and I don't have much of it left!). Any suggestions on what could be happening?

Here is a snapshot of the header file in Sublime Text 3 btw http://gyazo.com/e15942753819e695617390129e6aa879


Solution

  • As @martineau mentioned, what you're seeing is C struct packing (padding). The compiler inserts padding bytes so that each member starts at an offset that is a multiple of its alignment. Here that most likely means two padding bytes between _vertex_size and the 4-byte _vertex_count, so the reader picks up _vertex_count two bytes later than the Python writer put it. You can disable this with #pragma directives. For Visual C++, the syntax is #pragma pack(1), as explained on MSDN. In the following example I also use push and pop to restore the packing to its previous value. Proper alignment is important for efficient memory access, so changing it should be reserved for structures you write to disk.

    // pack members on 1-byte boundaries (no padding)
    #pragma pack(push, 1)
    struct MeshHeader
    {
        unsigned short _vertex_size;
        uint32 _vertex_count;
        unsigned short _index_buffer_count;
        short _position_offset;
        // ...
    };
    // restore alignment
    #pragma pack(pop)
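
To see where the 65536 in the question comes from, here is a minimal Python sketch that packs the first three fields the way the question's writer does and then reads them back at the offsets a padded C++ struct would typically use. The value 1 for _index_buffer_count is an assumption, inferred from 65536 being 0x00010000, and the padded offset of 4 assumes the usual 4-byte alignment for uint32.

```python
import struct

# 56 and 36 come from the question. index_buffer_count = 1 is an assumption
# inferred from the 65536 (0x00010000) that the C++ reader reported.
vertex_size, vertex_count, index_buffer_count = 56, 36, 1

# What the Python writer produces: three fields back to back, no padding.
data = (struct.pack('<H', vertex_size)
        + struct.pack('<I', vertex_count)
        + struct.pack('<H', index_buffer_count))

# Where a padded C++ struct typically expects the fields:
# _vertex_size at offset 0, two padding bytes, _vertex_count at offset 4.
read_vertex_size = struct.unpack_from('<H', data, 0)[0]
read_vertex_count = struct.unpack_from('<I', data, 4)[0]

# With _vertex_count declared as a 2-byte type there is no padding,
# so the reader looks at offset 2 and finds the low half of the real value.
read_as_uint16 = struct.unpack_from('<H', data, 2)[0]

print(data.hex())                           # 3800240000000100
print(read_vertex_size, read_vertex_count)  # 56 65536
print(read_as_uint16)                       # 36
```

The reader at offset 4 sees the upper half of the real _vertex_count followed by _index_buffer_count, which is how a count of 36 turns into 65536.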
    

    Note that even with struct padding disabled, you may still have issues with endianness. Writing structures to disk assumes you have full control and knowledge ahead of time of both writer and reader. You can make your life easier by using a proper serialization library. There are many; two examples that support both C++ and Python are protobuf and Avro.
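
If changing the C++ side is not an option, another route is to have the Python writer emit the padding itself. The sketch below is one way to do that with the `x` pad-byte code of the struct module. It assumes the compiler aligns uint32 to 4 bytes and rounds the struct up to 36 bytes, which should be checked against sizeof(MeshHeader) on the actual compiler. The names PADDED_HEADER, PACKED_HEADER and pack_mesh_header are hypothetical, and every field value other than 56 and 36 is a made-up placeholder.

```python
import struct

# Explicit little-endian layout with pad bytes ('x') where a typical compiler
# would insert them: 2 after _vertex_size and 2 at the end of the struct.
PADDED_HEADER = struct.Struct('<H2xIH' + 'hH' * 6 + '2x')

# Layout with no padding at all, matching a struct under #pragma pack(1).
PACKED_HEADER = struct.Struct('<HIH' + 'hH' * 6)


def pack_mesh_header(fields, padded=True):
    """Pack 15 integers given in the declaration order of MeshHeader."""
    layout = PADDED_HEADER if padded else PACKED_HEADER
    return layout.pack(*fields)


# vertex_size, vertex_count, index_buffer_count, then six
# (offset, component_count) pairs. Placeholder values for illustration.
fields = [56, 36, 1] + [0, 3, 12, 2, 20, 3, 32, 3, 44, 3, -1, 0]

print(PADDED_HEADER.size, PACKED_HEADER.size)  # 36 32
```

A single f.write(pack_mesh_header(fields)) would then replace the fifteen separate writes, and because the format string spells out both the byte order and the padding, the layout on disk no longer depends on what the Python side assumes implicitly.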