
Type punning in C using union


Consider the following code:

#include <stdio.h>
#include <stdlib.h>

union u{
    int i;
    long j[2];
};

int main(void){
    union u *u = malloc(sizeof *u);
    u->i = 10;
    printf("%li\n", u->j[0]);
}

I want to justify the validity of this code using 6.5/7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,

[...]

— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

Applying this to the example above we have:

  1. u->i = 10; gives the object u->i the effective type int.
  2. The lvalue u has a union type that contains a member of type int.
  3. The object u->j[0], which has an unspecified value, is accessed through the lvalue u, whose type union u has a member of type int.
  4. By the quoted part of 6.5, there is no UB here.

QUESTION: Is this reasoning correct, or does it contain a flaw?


Solution

  • Yes, your reasoning is correct. This is not undefined behavior but unspecified behavior, according to C11, section 6.2.6.1/7:

    When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.

    Section 3.19.3 clarifies what this means:

    unspecified value: valid value of the relevant type where this International Standard imposes no requirements on which value is chosen in any instance

    Annex J (Portability issues) restates this:

    J.1 Unspecified behavior
    1 The following are unspecified:
    — ...
    — The value of padding bytes when storing values in structures or unions (6.2.6.1).
    — The values of bytes that correspond to union members other than the one last stored into (6.2.6.1).
    — ...

    J.2, which lists undefined behavior, says nothing about accessing union members.

    That said, portability issues can be severe, as section 6.2.6.1/6 points out:

    The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.

    A trap representation is an "object representation that need not represent a value of the object type" (definition), and "fetching a trap representation might perform a trap but is not required to" (footnote). So reading the inactive member may interrupt the program; if it does not, the value read is simply not guaranteed.