c language-lawyer undefined-behavior strict-aliasing

Strict aliasing of first struct member through opaque pointer in C


Consider the following ISO C code:

struct S {
    int a;
};

void f() {
    struct S s;
    *(int*)&s = 1;  // 1
    ((struct S*)(struct Opaque*)&s)->a = 2;  // 2
    *(int*)(struct Opaque*)&s = 3;  // 3
}

Does this code contain any undefined behavior?

My own understanding is:

  • Statement 1 does not trigger undefined behavior because of C17 §6.7.2.1 ¶15: "A pointer to a structure object, suitably converted, points to its initial member and vice versa"
  • Statement 2 does not trigger undefined behavior, despite the casts, because we are accessing the object through its effective type struct S, so it does not violate the strict aliasing rule (C17 §6.5 ¶7)
  • But what about statement 3? Opaque is not defined as having a first member of type int, so it seems that §6.7.2.1 ¶15 does not apply (the unclear meaning of "suitably converted" tends to muddy the waters here). And the strict aliasing rule does not seem to allow accessing an object of effective type struct S through an int pointer (though it does allow the reverse). So, is statement 3 undefined behavior or not?

Solution

  • Statement 1 does not trigger undefined behavior because of C17 §6.7.2.1 ¶15: "A pointer to a structure object, suitably converted, points to its initial member and vice versa"

    That is indeed the only sensible interpretation of "suitably converted": an explicit cast to the type of the first member, with the correct qualifiers, if any. Per 6.5, a struct pointer may also alias a pointer to a member contained in that struct.
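A minimal sketch of statement 1, to make the "suitably converted" reading concrete. The function name write_first_member is made up for illustration and is not part of the question's code.

```c
#include <assert.h>

struct S {
    int a;
};

/* Hypothetical helper: writes through a pointer to the initial member
   and reads the value back through the struct itself. */
int write_first_member(void) {
    struct S s = {0};
    int *p = (int *)&s;   /* struct pointer converted to the type of the first member */
    assert(p == &s.a);    /* points to the initial member (6.7.2.1 ¶15) */
    *p = 1;               /* an int lvalue accessing an int member */
    return s.a;
}
```

Calling write_first_member() returns 1, since the write through p lands in s.a.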


  • Statement 2 does not trigger undefined behavior, despite the casts, because we are accessing the object through its effective type S

    Not necessarily. C17 6.3.2.3 ¶7 says:

    A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.

    So the (struct Opaque*)&s pointer conversion itself is allowed to be undefined, namely if the result is not correctly aligned for struct Opaque. That is somewhat unlikely to happen in practice, but possible in theory.

    Other than that, (struct S*) ... ->a is an lvalue access with the same type, struct S, as the effective type the object was declared with, so that part is well-defined.
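As a rough sketch of the round trip in statement 2: the code below splits the two conversions into separate steps. The function name round_trip_through_opaque is hypothetical, and struct Opaque is forward-declared at file scope here as an assumption (the question only names it inside the casts). The equality check holds only if the intermediate conversion was correctly aligned, which is exactly the part the standard leaves open for an incomplete type.

```c
struct S {
    int a;
};

struct Opaque;  /* incomplete type; its alignment requirement is unknown here */

/* Hypothetical helper: returns 1 if the pointer survives the round trip
   and the write through it is visible in s. */
int round_trip_through_opaque(void) {
    struct S s = {0};
    struct Opaque *o = (struct Opaque *)&s;  /* the conversion 6.3.2.3 ¶7 talks about */
    struct S *back = (struct S *)o;          /* converted back to the original type */
    back->a = 2;                             /* access through the effective type struct S */
    return back == &s && s.a == 2;
}
```

On mainstream implementations this should return 1, but that only shows what one compiler does, not what the standard guarantees.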


  • But what about statement 3?

    Same thing here: the pointer conversions aren't necessarily well-defined.

    Apart from that, it isn't sensible to say that an aggregate cannot be accessed through the element/member type of that aggregate. One big flaw in the strict aliasing rules is that they indeed don't mention what effective type an aggregate (or union) has, nor what happens with qualifiers in terms of effective type. Is int arr[5]; one object of effective type int [5] or 5 objects of effective type int? It isn't specified explicitly so nobody knows. It becomes a "quality of implementation" thing.

    Consider struct S* s = malloc(sizeof *s); s->a = 1;. Or int* a = malloc(sizeof(int[n])); a[0] = 1;. malloc returns a pointer to an object with no effective type. It won't get one until the first lvalue write access. In both of my examples we access a member/array item of type int. If that means that the type of that memory location is now a scalar int and not an aggregate struct/array, the whole C language comes crumbling down. Because then for example a[1] = 1; would suddenly be an out-of-bounds access of the scalar. Which would be ridiculous.
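The two malloc cases above, written out as a small sketch. The function name malloc_effective_type_demo and the error value -1 are made up for illustration, and the array size is computed as n * sizeof *a rather than sizeof(int[n]) so that the sketch does not depend on variable length array support.

```c
#include <stdlib.h>

struct S {
    int a;
};

/* Hypothetical helper: performs the member and element writes discussed
   above and returns the sum of the stored values, or -1 on failure. */
int malloc_effective_type_demo(size_t n) {
    if (n < 2) {
        return -1;                    /* a[1] must exist */
    }
    struct S *s = malloc(sizeof *s);  /* no effective type yet */
    int *a = malloc(n * sizeof *a);   /* no effective type yet */
    if (s == NULL || a == NULL) {
        free(s);
        free(a);
        return -1;
    }
    s->a = 1;   /* first write: through an int member lvalue */
    a[0] = 1;   /* first write: through an int element lvalue */
    a[1] = 1;   /* must remain valid, or arrays from malloc would be unusable */
    int sum = s->a + a[0] + a[1];
    free(s);
    free(a);
    return sum;
}
```

Every compiler accepts this and everyone writes code like it, which is the point: whatever the effective type rules formally say, they cannot reasonably be read as turning the allocation into a lone scalar int.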

    The only help we have from the C language here is 7.22.3 which says that the memory returned must be usable as an array (but structs/unions are not mentioned). This rule is not harmonized with the concept of effective type however.

    So the answer is: nobody knows. The language standard is unhelpful and unclear here.