c++ language-lawyer undefined-behavior c++26 erroneous-behavior

When is an erroneous value not valid for the object's type?


In [conv.lval] p3.4, the result of lvalue-to-rvalue conversion is described as follows:

Otherwise, the object indicated by the glvalue is read ([defns.access]), and the value contained in the object is the prvalue result. If the result is an erroneous value ([basic.indet]) and the bits in the value representation are not valid for the object's type, the behavior is undefined.

When or why would this situation happen? For example, say we have the following code:

bool x;      // x has erroneous value (since C++26)
bool y = x;  // lvalue-to-rvalue conversion

Can't the compiler always choose the erroneous value of x so that it's valid for bool? If this were undefined behavior, it would defeat the purpose of erroneous behavior.

Keep in mind that P2795R5: Erroneous behaviour for uninitialized reads says:

The proposed change of behaviour has a runtime cost for existing code, since in general additional initialization of memory is now required.

Or in standardese ([basic.indet] p1.2):

otherwise, the bytes have erroneous values, where each value is determined by the implementation independently of the state of the program.

Presumably, this means that the compiler can and should initialize the memory (e.g. by zeroing) of x so that it would have a valid (erroneous) value. After all, erroneous behavior is:

well-defined behavior that the implementation is recommended to diagnose


Solution

  • When or why would this situation happen? For example, say we have the following code:

    bool x;      // x has erroneous value (since C++26)
    bool y = x;  // lvalue-to-rvalue conversion
    

    Can't the compiler always choose the erroneous value of x so that it's valid for bool?

    It could, but it doesn't have to. You already quoted the section in [basic.indet] that talks about bytes having erroneous values. The standard places no requirement on which values the implementation picks, so the erroneous value in the byte representation of x might be 42, which is not valid for bool.

    The same applies to pointers:

    T* a;
    T* b = a;
    

    So if such a pattern is chosen as the erroneous value of an uninitialized object (e.g. for bool or a pointer), then such a read still has undefined behavior.
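To make the point about byte patterns concrete, here is a minimal sketch. It assumes a typical ABI where bool occupies one byte and only the byte values 0 and 1 are valid representations; the standard itself does not mandate this. The helpers `is_valid_bool_byte` and `bool_from_byte` are hypothetical names for illustration, and 0xAA is just an example of a fill byte an implementation might choose. The sketch never reads an uninitialized object; it only classifies candidate byte values.

```cpp
#include <cassert>

// Assumption: bool is a single byte, as on mainstream ABIs.
static_assert(sizeof(bool) == 1, "sketch assumes a one-byte bool");

// Hypothetical helper: would this byte be a valid value representation
// for bool under the assumed ABI (0 = false, 1 = true)?
constexpr bool is_valid_bool_byte(unsigned char byte) {
    return byte == 0 || byte == 1;
}

// Hypothetical helper: turn a byte into a bool only if the byte is a
// valid representation. This mirrors the condition in [conv.lval] p3.4:
// an erroneous value whose bits are valid for the type gives erroneous
// (but defined) behavior; invalid bits give undefined behavior.
inline bool bool_from_byte(unsigned char byte) {
    assert(is_valid_bool_byte(byte));
    return byte != 0;
}

// A zero fill would be valid for bool; 42 or a fill byte such as 0xAA
// would not be.
static_assert(is_valid_bool_byte(0));
static_assert(!is_valid_bool_byte(42));
static_assert(!is_valid_bool_byte(0xAA));
```

Under these assumptions, an implementation that fills uninitialized storage with zero bytes yields a valid bool, so reading x is only erroneous behavior, while a non-zero fill such as 0xAA yields an invalid bool and the read stays undefined.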