c++ c++14 strict-aliasing

Strict aliasing and union of char arrays


I think I already know what the answer will be, but I am curious what the standard says about this.

The setup: I want to control the exact offset of fields in a structure, and specify them directly in the field type. Here is the magic:

#include <type_traits>
#include <cstdint>
#include <cstring>
#include <new>

template <uint32_t OFFSET, typename T>
struct FieldOverlay
{
    static_assert(std::is_trivial<T>::value, "Can only be used with trivial types");

    FieldOverlay() = delete;
    FieldOverlay(FieldOverlay&& other) = delete;
    FieldOverlay(FieldOverlay const& other) = delete;

    FieldOverlay& operator = (T const& val) { new (buf + OFFSET) T(val); return *this; }

    operator T()
    {
        T v;
        ::memcpy(&v, buf + OFFSET, sizeof(T));
        return v;
    }
private:
    char buf[OFFSET + sizeof(T)];
};

// Precisely control member offsets
union MyMessage
{
    FieldOverlay<0, uint32_t> x;
    FieldOverlay<7, uint32_t> y;
};

void exampleUsage(MyMessage& m)
{
    m.y = m.x;
}

struct MyMessageEquivalent
{
    uint32_t x;
    char padding[3];
    uint32_t y;
} __attribute__ ((packed));

This compiles on gcc 6.3 with -O3 -std=c++1z -fstrict-aliasing -Wall -Wpedantic -Wextra -Werror without any errors, and works as expected. (See godbolt: https://godbolt.org/g/DHWLD9)

Question: Is this kosher according to the standard? I think it gets very dicey because there's not a single "active" member of the union MyMessage. However, since everything is accessed through char arrays could that help with the strict-aliasing rules?


Solution

  • Your code has lots of problems, but you'll invoke UB long before strict aliasing is even an issue.

    MyMessage::x and MyMessage::y are not layout-compatible types. They also do not have a common initial sequence. Yes, both store an array of char, but the arrays have different lengths (4 and 11 elements for a 4-byte uint32_t), and nothing in the common initial sequence rules says that two structs containing arrays of the same element type but of different sizes have a common initial sequence.

    Therefore, you cannot access x while y is the active member of the union, or vice versa.

    And FYI: your reinterpret casts provoke UB as well. There is no T in that memory, and reinterpret_cast cannot create an object. So accessing that memory as though it contained a T violates the standard. Also, the standard doesn't allow you to access an object through a misaligned pointer, and buf + 7 is generally not suitably aligned for a uint32_t.

    So basically, what you're trying to do is never going to work.
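
For comparison, here is a minimal sketch of one way to get fixed byte offsets without a union or a typed object at a misaligned address. It assumes the message can be held as a single byte array and that every field type is trivially copyable. The names RawMessage, get, set and MyMessageBytes are hypothetical and do not come from the question. Both reads and writes go through memcpy, the same byte copy the question's conversion operator already uses for reads, so no T object ever has to live inside the buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not from the question): one byte buffer with typed
// accessors that copy bytes in and out instead of overlaying objects.
template <std::size_t SIZE>
struct RawMessage
{
    unsigned char bytes[SIZE];

    // Copy sizeof(T) bytes starting at OFFSET into a properly aligned local T.
    template <typename T, std::size_t OFFSET>
    T get() const
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "T must be trivially copyable");
        static_assert(OFFSET + sizeof(T) <= SIZE,
                      "field does not fit in the buffer");
        T value;
        std::memcpy(&value, bytes + OFFSET, sizeof(T));
        return value;
    }

    // Copy the bytes of value into the buffer starting at OFFSET.
    template <typename T, std::size_t OFFSET>
    void set(T const& value)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "T must be trivially copyable");
        static_assert(OFFSET + sizeof(T) <= SIZE,
                      "field does not fit in the buffer");
        std::memcpy(bytes + OFFSET, &value, sizeof(T));
    }
};

// Same offsets as the question's MyMessage: x at byte 0, y at byte 7.
using MyMessageBytes = RawMessage<11>;

void exampleUsage(MyMessageBytes& m)
{
    // Equivalent of "m.y = m.x" in the question.
    m.set<std::uint32_t, 7>(m.get<std::uint32_t, 0>());
}
```

The offsets move from the field type into the accessor call, which loses the m.y = m.x syntax but keeps every access a plain byte copy. Compilers typically turn a fixed-size memcpy like this into an ordinary load or store, though that is an optimization and not a guarantee.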