Tags: c, struct, casting, language-lawyer, packing

Are packed identical structs guaranteed to have the same memory layout?


Say I have two structs, object and widget:

struct object {
    int field;
    void *pointer;
};
struct widget {
    int field;
    void *pointer;
};

And a function:

void consume(struct object *obj)
{
    printf("(%i, %p)\n", obj->field, obj->pointer);
}

I'm aware that if I try and do:

struct widget wgt = {3, NULL};
consume(&wgt);

I would violate the strict aliasing rule and thus invoke undefined behaviour.

As far as I understand, the undefined behaviour results from the fact that the compiler may lay out the two structs differently, padding fields to align them with address boundaries (but never changing the field order, which the standard guarantees).

But what if the two structs are packed? Will they have the same memory layout? Or, in other words, does the above call to consume() still invoke undefined behaviour (despite the persistent compiler warning)?

Note: I used struct __attribute__((__packed__)) object { ... }; for packing (GCC).


Solution

  • They will most likely have the same layout; that will be part of the compiler's ABI.

    The relevant architecture and/or OS may have a standard ABI, which may or may not include a specification for packed structs. Either way, the compiler has its own ABI and lays them out in a predictable fashion, although the algorithm may not be written down precisely anywhere except the compiler source code.
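
    If you want the compiler to confirm the layout claim rather than take it on faith, a minimal sketch along these lines may help. It assumes GCC or Clang (for the packed attribute) and C11 (for static_assert); layouts_match() is a hypothetical helper name, not something from the question.

    ```c
    #include <assert.h>   /* static_assert (C11) */
    #include <stddef.h>   /* offsetof */

    struct __attribute__((__packed__)) object {
        int field;
        void *pointer;
    };

    struct __attribute__((__packed__)) widget {
        int field;
        void *pointer;
    };

    /* Compile-time checks: the build fails if the two layouts ever diverge. */
    static_assert(sizeof(struct object) == sizeof(struct widget),
                  "object and widget differ in size");
    static_assert(offsetof(struct object, field) == offsetof(struct widget, field),
                  "field sits at different offsets");
    static_assert(offsetof(struct object, pointer) == offsetof(struct widget, pointer),
                  "pointer sits at different offsets");

    /* Hypothetical helper: the same comparison at run time; returns 1 on a match. */
    int layouts_match(void)
    {
        return sizeof(struct object) == sizeof(struct widget)
            && offsetof(struct object, field) == offsetof(struct widget, field)
            && offsetof(struct object, pointer) == offsetof(struct widget, pointer);
    }
    ```

    These checks only establish that the bytes line up for this compiler and target. They say nothing about whether accessing one type through a pointer to the other is allowed.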

    However, that does not mean your code is safe. The strict aliasing rule applies to pointers to different types, whether or not they have the same layout.

    Here is an example that can be compiled with gcc -O2:

    #include <stdio.h>
    
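    /* GCC expects the attribute after the struct keyword (or after the
       closing brace); placed before "struct" it is ignored with a warning. */
    struct __attribute__((packed)) object {
        int field;
        void *pointer;
    };
    
    struct __attribute__((packed)) widget {
        int field;
        void *pointer;
    };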
    
    struct widget *some_widget;
    
    __attribute__((noipa)) // prevent inlining, which would hide the bug
    void consume(struct object *obj)
    {
        some_widget->field = 42;
        int val = obj->field;
        printf("%i\n", val);
    }
    
    int main(void) {
        struct widget wgt = {3, NULL};
        some_widget = &wgt;
        consume((struct object *)&wgt);
    }
    


    You might expect this code to print 42, because some_widget and obj both point to wgt, so val = obj->field should read the same int that some_widget->field = 42 has just written. But when built with gcc -O2, it prints 3. The compiler is allowed to assume that obj and some_widget do not alias, because they point to different types; the write and the read are therefore treated as independent and may be reordered.

    At the level of the standard, you are accessing the object wgt, whose effective type is struct widget, through the lvalue *obj, whose type is struct object. These types are not compatible because they have different tags (widget vs object), and so the behavior is undefined.
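
    If the goal is simply to hand a widget to code that expects an object, one way to stay within defined behavior is to copy the bytes into a real struct object instead of casting the pointer. The following is a minimal sketch, assuming GCC or Clang and C11; object_from_widget() is a hypothetical helper, and the static_assert states the assumption that the two layouts are the same size.

    ```c
    #include <assert.h>   /* static_assert (C11) */
    #include <stddef.h>   /* NULL */
    #include <string.h>   /* memcpy */

    struct __attribute__((__packed__)) object {
        int field;
        void *pointer;
    };

    struct __attribute__((__packed__)) widget {
        int field;
        void *pointer;
    };

    /* Assumption made explicit: the byte copy below needs equal sizes. */
    static_assert(sizeof(struct object) == sizeof(struct widget),
                  "object and widget must have the same size");

    /* Hypothetical helper: builds a genuine struct object from a widget's bytes.
       The result has declared type struct object, so reading it through a
       struct object lvalue does not involve the aliasing problem. */
    struct object object_from_widget(const struct widget *w)
    {
        struct object o;
        memcpy(&o, w, sizeof o);
        return o;
    }
    ```

    With this, the call from the question could become struct object tmp = object_from_widget(&wgt); consume(&tmp);. The cost is that consume() then sees a copy, so writes made through some_widget after the copy are not visible through tmp.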