c++ llvm strict-aliasing

Do clang/gcc assume that a pointer to one member may alias another member of a different type?


In the following example, the compiler appears to assume that the pointer to double passed to bar() may alias the integer member a:

struct A {
    int a;
    double b;
};

void bar(double*);

int foo() {
    A a;
    a.a = 9;
    bar(&a.b);
    return a.a;
}

clang generates:

        sub     sp, sp, #32
        stp     x29, x30, [sp, #16]
        add     x29, sp, #16
        mov     w8, #9
        str     w8, [sp]
        mov     x8, sp              # saving the variable
        add     x0, x8, #8
        bl      bar(double*)
        ldr     w0, [sp]            # the reload
        ldp     x29, x30, [sp, #16]
        add     sp, sp, #32
        ret

(godbolt)

This doesn't happen, obviously, if the integer variable is defined outside of the class (godbolt). So it seems to me that both gcc and clang assume that bar(double*) can clobber A::a through A::b, which feels weird since a double* cannot alias an int*. Is this intuition correct, and if so, why?
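
For comparison, the second case mentioned above (the integer defined outside the class) might look like the sketch below. The name foo_separate and the trivial body of bar are assumptions made for illustration; in the original experiment bar is only declared, so the compiler cannot see its body.

```cpp
// Stand-in definition so the sketch is self-contained (assumption: the
// original only declares bar and defines it in another translation unit).
void bar(double* d) { *d = 1.0; }

// Hypothetical counterpart of foo() with the int as a separate local.
int foo_separate() {
    int a = 9;     // independent object; its address never escapes
    double b;
    bar(&b);       // bar receives only the address of b
    return a;      // a cannot be reached from &b, so no reload is needed
}
```

With a separate local there is no enclosing object to walk back through, so the compiler is free to return the constant 9 directly.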


Solution

  • The compiler considers that bar may do something like

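    #include <cstddef>  // offsetof
    #include <new>      // std::launder

    void bar(double* d) {
        // Step back from &a.b to the start of the enclosing A, then write a.a.
        *std::launder(
            reinterpret_cast<int*>(
                reinterpret_cast<unsigned char*>(d) - offsetof(A, b))) = 0;
    }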
    

    which effectively changes the value of a.a.

    Now, this isn't actually defined by the standard. What reinterpret_cast<unsigned char*>(d) should mean is currently underspecified in the first place, but even assuming that it yields a pointer into the object representation of *d, subtracting from it should probably be UB, because it is effectively the same as subtracting from a pointer to the initial element of an array.

    The intention of the standard as currently written seems to be that the compiler should be able to assume that a.a is unreachable from d. std::launder has preconditions to that effect. However, there are also other language features that do not seem to properly integrate with the reachability requirements that std::launder has (see previous questions of mine).

    However, in practice this is a common pattern, used e.g. by the container_of macro, so even if the compiler were allowed to assume that the call to bar can't change a.a, I would still expect compilers not to make that assumption, as it would break these traditional techniques (at least for standard-layout classes, I would say).

    As far as I know (I might be wrong), the container_of approach is also well-defined in C, so even if C++ permitted the optimization, the need to stay compatible with mixed C code would probably restrict its use, again at least for types that are sufficiently POD-like to be intended for C compatibility.

    In practice, that means that once a pointer into any member or base of a class-type object escapes, the compiler has to assume that the whole object may be modified.
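
The container_of pattern mentioned above can be sketched as follows. The helper name container_of_b is hypothetical, and, as discussed, the pointer arithmetic is not clearly sanctioned by the C++ standard; the sketch only illustrates what gcc and clang tolerate in practice.

```cpp
#include <cstddef>  // offsetof

struct A {
    int a;
    double b;
};

// Hypothetical helper modelled on the container_of macro: recover a pointer
// to the enclosing A from a pointer to its member b.
inline A* container_of_b(double* pb) {
    return reinterpret_cast<A*>(
        reinterpret_cast<unsigned char*>(pb) - offsetof(A, b));
}

// bar only receives &a.b, yet it reaches and overwrites the sibling member a.
void bar(double* d) {
    container_of_b(d)->a = 0;
}

int foo() {
    A a;
    a.a = 9;
    bar(&a.b);   // the address of a.b escapes here
    return a.a;  // must be reloaded: bar may have changed it
}
```

In practice, foo() here yields 0 rather than 9 on gcc and clang, which is why the reload of a.a after the call cannot be dropped.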