Tags: c, struct, unions, pointer-aliasing

Exception to strict aliasing rule in C from 6.5.2.3 Structure and union members


Quote from the C99 standard:

6.5.2.3

5 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

The standard gives an example for this case:

// The following code is not a valid fragment because
// the union type is not visible within the function f.

struct t1 { int m; };
struct t2 { int m; };

int f(struct t1 *p1, struct t2 *p2)
{
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}

int g()
{
    union
    {
        struct t1 s1;
        struct t2 s2;
    } u;

    /* ... */
    return f(&u.s1, &u.s2);
}

I made a few changes:

#include <stdio.h>

struct t1 { int m; };
struct t2 { int m; };

union u
{
    struct t1 s1;
    struct t2 s2;
};

int foo(struct t1 *p1, struct t2 *p2)
{
    if (p1->m)
        p2->m = 2;
    return p1->m;
}

int main(void)
{
    union u u;
    u.s1.m = 1;
    printf("%d\n", foo(&u.s1, &u.s2));
}

As you can see, I moved the union declaration outside the function so that it is visible in foo(). According to the comment in the standard's example, this should have made my code correct, but strict aliasing still appears to break it with clang 3.4 and gcc 4.8.2.

Output with -O0:

2

Output with -O2:

1

for both compilers.
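
One plausible way to picture the -O2 result is sketched below, assuming the optimizer treats p1 and p2 as non-aliasing because struct t1 and struct t2 are distinct types. Under that assumption it may load p1->m once and reuse the value. The names foo_cached and demo_cached are hypothetical, and the function is a hand-written approximation, not actual compiler output.

```c
#include <assert.h>

struct t1 { int m; };
struct t2 { int m; };

union u
{
    struct t1 s1;
    struct t2 s2;
};

/* Hand-written approximation (hypothetical name) of what an optimizer may
   effectively turn foo() into when it assumes p1 and p2 never alias. */
int foo_cached(struct t1 *p1, struct t2 *p2)
{
    int cached = p1->m;   /* p1->m is loaded exactly once */
    if (cached)
        p2->m = 2;        /* assumed not to modify p1->m */
    return cached;        /* the earlier value is reused, not reloaded */
}

/* Hypothetical helper: calls foo_cached() the way main() calls foo(). */
int demo_cached(void)
{
    union u u;
    u.s1.m = 1;
    int result = foo_cached(&u.s1, &u.s2);
    assert(u.s2.m == 2);  /* the store through p2 did reach the union */
    return result;        /* but the returned value is the stale one */
}
```

Here demo_cached() returns 1 even though the union now holds 2, which matches the -O2 output above.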

So my question is:

Does C really rely on the union declaration to decide whether some structures are an exception to the strict aliasing rule? Or do both gcc and clang have a bug?

It seems really broken to me, because even if the function and the union are both declared in the same header, this does not guarantee that the union is visible in the translation unit that contains the body of the function.


Solution

  • The most important point is that your change (moving the union up) does not change the definition of the function foo at all. It is still a function that receives two unrelated pointers. In your example the pointers passed in are related, but at other call sites they might not be, and the compiler's goal is to handle the most general case. The body of the function also differs from the standard's example after your change, and it is not clear why.

    The question you are asking is really about how carefully optimization is implemented in your particular compiler for certain command-line options. It has nothing to do with memory layout. With a correct compiler the result should be the same at every optimization level: the compiler should handle the case where two different pointers in fact point to the same place in memory.
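
A minimal sketch of one way to sidestep the problem, assuming you are free to change the function's signature: pass a pointer to the union itself, so that every access is written as a member access on the union object. The names foo_via_union and demo_via_union are hypothetical and do not appear in the question.

```c
#include <assert.h>

struct t1 { int m; };
struct t2 { int m; };

union u
{
    struct t1 s1;
    struct t2 s2;
};

/* Hypothetical variant of foo(): it receives the union, so the compiler can
   see that s1.m and s2.m are members of the same object and may overlap. */
int foo_via_union(union u *pu)
{
    if (pu->s1.m)
        pu->s2.m = 2;     /* store through the union member s2 */
    return pu->s1.m;      /* read the common initial part through s1 */
}

/* Hypothetical helper mirroring main() from the question. */
int demo_via_union(void)
{
    union u u;
    u.s1.m = 1;
    return foo_via_union(&u);
}
```

Because both accesses go through the union type, the read of s1.m has to observe the store to s2.m, so demo_via_union() returns 2 regardless of the optimization level. The trade-off is that the function can no longer be called with two genuinely separate struct objects.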