
Strict aliasing and overlay inheritance


Consider this code example:

#include <stdio.h>

typedef struct A A;

struct A {
   int x;
   int y;
};

typedef struct B B;

struct B {
   int x;
   int y;
   int z;
};

int main()
{
    B b = {1,2,3};
    A *ap = (A*)&b;

    *ap = (A){100,200};      //a clear http://port70.net/~nsz/c/c11/n1570.html#6.5p7 violation

    ap->x = 10;  ap->y = 20; //lvalues of types int and int at the right addresses, ergo correct?

    printf("%d %d %d\n", b.x, b.y, b.z);
}

I used to think that casting a B* to an A* and using the A* to manipulate the B object was a strict aliasing violation. But then I realized that the standard really only requires that:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 1) a type compatible with the effective type of the object, (...)

and expressions such as ap->x do have the correct type and address, and the type of ap shouldn't really matter there (or does it?). This would, in my mind, imply that this type of overlay inheritance is correct as long as the substructure isn't manipulated as a whole.

Is this interpretation flawed or ostensibly at odds with what the authors of the standard intended?


Solution

  • The line with *ap = is a strict aliasing violation: an object of type B is written using an lvalue expression of type A.

    Suppose that line were not present, and we moved on to ap->x = 10; ap->y = 20;. In this case lvalues of type int are used to write objects of type int.

    There is disagreement about whether this is a strict aliasing violation or not. I think that the letter of the Standard says that it is not, but others (including gcc and clang developers) consider ap->x as implying that *ap was accessed. Most agree that the standard's definition of strict aliasing is too vague and needs improvement.

    Sample code using your struct definitions:

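    #include <stdio.h>

    /* struct A and struct B as defined in the question go here */

    void f(A* ap, B* bp)
    {
      ap->x = 213;
      ++bp->x;
      ap->x = 213;
      ++bp->x;
    }

    int main()
    {
       B b = { 0 };
       f( (A *)&b, &b );   /* both arguments point at the same object */
       printf("%d\n", b.x);
    }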
    

    For me this outputs 214 at -O2, and 2 at -O3, with gcc. The generated assembly on godbolt for gcc 6.3 was:

    f:
        movl    (%rsi), %eax
        movl    $213, (%rdi)
        addl    $2, %eax
        movl    %eax, (%rsi)
        ret
    

    which shows that the compiler has rearranged the function to:

    int temp = bp->x + 2;
    ap->x = 213;
    bp->x = temp;
    

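    and therefore the compiler must be assuming that ap->x does not alias bp->x. Had it allowed for aliasing, it could not have merged the two increments, because each store to ap->x would change the value that the following ++bp->x reads.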