Tags: c++, language-lawyer, c++20, unions

Active member of a union created directly through memcpy() or std::bit_cast()?


Consider the following code:

#include <stdint.h>
#include <string.h>
#include <bit>

union U
{
    uint64_t ull;
    int32_t i[2];
};

int main(int argc, char *argv[])
{
    uint64_t x = argc;
    U u;
    if (true)
        u = std::bit_cast<U>(x);
    else
        ::memcpy(&u, &x, sizeof(x));
    auto r = u.i[0];
    r += u.i[1];
    r += u.ull;
    return r;
}

I have 3 questions:

  1. Which member of u is active immediately before the auto r = ... line?

  2. Which line(s) have undefined behavior (if any), and why?

  3. Does flipping true to false change the answers above, and if so how/why?


Solution

  • Which member of u is active immediately before the auto r = ... line?

    The default-initialized u does not have any active member.

    std::bit_cast implicitly creates objects nested within the returned object. A member will be active only if std::bit_cast's implicit object creation starts the lifetime of one of the subobjects. Implicit object creation is specified to do this only if that would give the program execution defined behavior.

    However, regardless of which member's lifetime implicit object creation would start, the program goes on to read an inactive member: if u.i is made active, then r += u.ull; has undefined behavior, and if u.ull is made active, then the reads of u.i[0] and u.i[1] do.

    Therefore, the program has undefined behavior regardless and no member of u is active.

  • Which line(s) have undefined behavior (if any), and why?

    Individual lines cannot have undefined behavior. Either the whole program has undefined behavior (for given input) or it doesn't.

    With all lines present the program has undefined behavior. Removing either the line that reads u.ull or the lines that read u.i gives the program defined behavior, because the implicit object creation of std::bit_cast will then make the remaining member active. This assumes that copy assignment of the union is intended to make the same member active in the destination as is active in the source. The specification does not seem clear to me on that point; see e.g. the discussion in https://github.com/cplusplus/draft/issues/5193.

  • Does flipping true to false change the answers above, and if so how/why?

    The same problem remains. memcpy also implicitly creates objects nested in the destination by the same rules.
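
As a rough illustration of the point above, here is a minimal sketch of the memcpy branch with the read of u.ull removed. The helper name split_halves and the use of std::array for the return value are assumptions made for this example, not part of the question. Under the reading given in this answer, implicit object creation can make u.i the active member, so reading only u.i should be fine.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

union U {
    std::uint64_t ull;
    std::int32_t i[2];
};

// Hypothetical helper (not from the question): copies the bytes of x into a
// U and then reads only the array member.
std::array<std::int32_t, 2> split_halves(std::uint64_t x) {
    static_assert(sizeof(U) == sizeof x, "U must be exactly 8 bytes");
    U u;                            // default-initialized: no active member yet
    std::memcpy(&u, &x, sizeof x);  // implicitly creates objects in u's storage
    return {u.i[0], u.i[1]};        // only u.i is read; u.ull is never touched
}
```

Which half of x ends up in i[0] depends on the platform's byte order, so callers should not rely on a particular ordering of the two elements.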


    Furthermore, depending on the input and on the implementation-defined representations of the types involved, r += u.i[1]; may have undefined behavior independently of the above, because of signed integer overflow.
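
One possible way to rule out the overflow, sketched here under the assumption that a 64-bit result is acceptable, is to widen one operand before adding. The helper name widened_sum is hypothetical and does not appear in the question.

```cpp
#include <cstdint>

// Hypothetical helper: performs the addition in a 64-bit signed type, which
// can represent the sum of any two 32-bit values, so no overflow can occur.
std::int64_t widened_sum(std::int32_t a, std::int32_t b) {
    return static_cast<std::int64_t>(a) + b;  // b is converted to int64_t too
}
```

For example, if both halves held 0x7FFFFFFF, adding them as int32_t would overflow, whereas widened_sum(INT32_MAX, INT32_MAX) yields 4294967294.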