c++ c++11 language-lawyer reinterpret-cast strict-aliasing

Is this use of reinterpret_cast on differently-qualified struct members safe?


I have looked at the following related questions, and none of them seems to address my exact issue: one, two, three.

I am writing a collection whose elements (key-value pairs) are stored along with some bookkeeping information:

struct Element {
    Key key;
    Value value;
    int flags;
};

std::vector<Element> elements;

(For simplicity, suppose that both Key and Value are standard-layout types. The collection won't be used with any other types anyway.)

In order to support iterator-based access, I've written iterators that overload operator-> and operator* to return to the user a pointer and a reference, respectively, to the key-value pair. However, due to the nature of the collection, the user is never allowed to change the returned key. For this reason, I've declared a KeyValuePair structure:

struct KeyValuePair {
    const Key key;
    Value value;
};

And I've implemented operator-> on the iterator like this:

struct iterator {
    size_t index;

    KeyValuePair *operator->() {
        return reinterpret_cast<KeyValuePair *>(&elements[index]);
    }
};

My question is: is this use of reinterpret_cast well-defined, or does it invoke undefined behavior? I have tried to interpret the relevant parts of the standard and examined answers to questions about similar issues; however, I failed to draw a definitive conclusion from them, because:

  • the two struct types share some initial data members (namely, key and value) that only differ in const-qualification;
  • the standard does not explicitly say that T and cv T are layout-compatible, but it doesn't state the contrary either; furthermore, it mandates that they have the same representation and alignment requirements;
  • two standard-layout class types share a common initial sequence if the first however many members have layout-compatible types;
  • for union types containing members of class type that share a common initial sequence, it is permitted to examine the members of such an initial sequence using either of the union members (9.2p18):
      – there's no similar explicit guarantee made about reinterpret_casted pointers-to-structs sharing a common initial sequence;
      – it is, however, guaranteed that a pointer-to-struct points to its initial member (9.2p19).

Using merely this information, I found it impossible to deduce whether the Element and KeyValuePair structs share a common initial sequence, or have anything else in common that would justify my reinterpret_cast.

As an aside, if you think using reinterpret_cast for this purpose is inappropriate, and I'm really facing an XY problem and therefore I should simply do something else to achieve my goal, let me know.


Solution

  • My question is: is this use of reinterpret_cast well-defined, or does it invoke undefined behavior?

    reinterpret_cast is the wrong approach here: you're simply violating strict aliasing. It is somewhat perplexing that reinterpret_cast and union diverge here, but the wording is very clear about this scenario.

    You might be better off simply defining a union thusly:

    union elem_t {
        Element e{};
        KeyValuePair p;
        /* special member functions defined if necessary */
    };
    

    … and using that as your vector element type. Note that cv-qualification is ignored when determining layout-compatibility (see [basic.types]/11):

    Two types cv1 T1 and cv2 T2 are layout-compatible types if T1 and T2 are the same type, […]

    Hence Element and KeyValuePair do indeed share a common initial sequence, and accessing the corresponding members of p while e is the active member of the union is well-defined.
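A minimal sketch of the union approach, under some assumptions that are not in the original: Key and Value are both taken to be int purely for illustration, the constructor and the helpers key_of, value_of and example_union_usage are hypothetical names, and the union is given a user-provided constructor so that e is always the active member. The static_asserts only check the preconditions of the common-initial-sequence rule. The sketch deliberately only reads through p; whether writes through p are covered is not something it tries to settle.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical concrete types; the question leaves Key and Value open.
using Key = int;
using Value = int;

struct Element {
    Key key;
    Value value;
    int flags;
};

struct KeyValuePair {
    const Key key;
    Value value;
};

// Sanity checks only: they confirm the preconditions of the
// common-initial-sequence rule, nothing more.
static_assert(std::is_standard_layout<Element>::value,
              "Element must be standard-layout");
static_assert(std::is_standard_layout<KeyValuePair>::value,
              "KeyValuePair must be standard-layout");
static_assert(offsetof(Element, value) == offsetof(KeyValuePair, value),
              "value should sit at the same offset in both structs");

union elem_t {
    Element e;
    KeyValuePair p;

    // Hypothetical constructor: always makes e the active member.
    elem_t(Key k, Value v) : e{k, v, 0} {}
};

// Read the common initial sequence through p while e is active.
inline Key key_of(const elem_t& u) { return u.p.key; }
inline Value value_of(const elem_t& u) { return u.p.value; }

inline void example_union_usage() {
    std::vector<elem_t> elements;
    elements.push_back(elem_t(1, 2));
    assert(key_of(elements[0]) == 1);
    assert(value_of(elements[0]) == 2);
}
```

Because KeyValuePair has a const member, elem_t is not copy-assignable, so vector operations that assign elements (erase, insert in the middle) would need extra care in a real implementation.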


    Another approach: Define

    struct KeyValuePair {
        Key key;
        mutable Value value;
    };
    
    struct Element : KeyValuePair {
        int flags;
    };
    

    Now provide an iterator that simply wraps a const_iterator from the vector and upcasts the exposed references and pointers to KeyValuePair. Through that const access path, key won't be modifiable, but value will be, because it is declared mutable.
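The inheritance approach could look roughly like the following sketch. The Collection class, its insert member and example_usage are hypothetical, and Key and Value are assumed to be int and std::string only to make the example concrete. The point is that the iterator needs no reinterpret_cast at all: it relies on the implicit derived-to-base conversion from const Element to const KeyValuePair.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical concrete types; the question leaves Key and Value open.
using Key = int;
using Value = std::string;

struct KeyValuePair {
    Key key;
    mutable Value value;  // stays writable through a const reference
};

struct Element : KeyValuePair {
    int flags;
};

// Hypothetical container wrapping the vector from the question.
class Collection {
public:
    class iterator {
    public:
        explicit iterator(std::vector<Element>::const_iterator it) : it_(it) {}

        // Implicit derived-to-base conversions; no casts are needed.
        const KeyValuePair& operator*() const { return *it_; }
        const KeyValuePair* operator->() const { return &*it_; }

        iterator& operator++() { ++it_; return *this; }
        bool operator==(const iterator& other) const { return it_ == other.it_; }
        bool operator!=(const iterator& other) const { return it_ != other.it_; }

    private:
        std::vector<Element>::const_iterator it_;
    };

    void insert(Key key, Value value) {
        Element e;
        e.key = key;
        e.value = std::move(value);
        e.flags = 0;
        elements_.push_back(std::move(e));
    }

    iterator begin() const { return iterator(elements_.cbegin()); }
    iterator end() const { return iterator(elements_.cend()); }

private:
    std::vector<Element> elements_;
};

inline void example_usage() {
    Collection c;
    c.insert(1, "one");
    Collection::iterator it = c.begin();
    it->value = "uno";   // allowed: value is mutable
    // it->key = 2;      // would not compile: key is reached through a const path
    assert(it->key == 1);
    assert((*it).value == "uno");
}
```

One trade-off worth keeping in mind: mutable makes value writable through every const access path, including a const Collection, so this design gives up the usual distinction between iterator and const_iterator.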