Does std::optional<>::emplace() invalidate references to the inner value?

Consider the following fragment (assume that T is trivially constructible and trivially destructible):

std::optional<T> opt;
opt.emplace();
T& ref = opt.value();
opt.emplace();
// is ref guaranteed to be valid here?

From the definition of std::optional we know that the contained value is stored inside the storage of the std::optional object itself, so the reference ref will always refer to the same memory location. Are there circumstances in which that reference does not remain valid after the referred-to object is destroyed and a new one is constructed in its place?

Solution

C++20 has the following rule, [basic.life]/8:

If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object. An object o₁ is transparently replaceable by an object o₂ if:

the storage that o₂ occupies exactly overlays the storage that o₁ occupied, and

o₁ and o₂ are of the same type (ignoring the top-level cv-qualifiers), and

o₁ is not a complete const object, and

neither o₁ nor o₂ is a potentially-overlapping subobject (6.7.2), and

either o₁ and o₂ are both complete objects, or o₁ and o₂ are direct subobjects of objects p₁ and p₂, respectively, and p₁ is transparently replaceable by p₂.

This suggests that as long as T is not const-qualified, destroying the T inside a std::optional<T> and then recreating it should result in a reference to the old object automatically referring to the new object. As pointed out in the comments section, this is a change from the old behaviour: it abolishes the requirement that T must not contain a non-static data member of const-qualified or reference type. (Edit: I previously asserted that the change was made retroactively, as I confused it with a different change in C++20. I am not sure whether the resolution to RU 007 and US 042 as indicated in N4858 was made retroactive, but I suspect the answer is yes, because the change was needed to fix code involving standard library templates that was probably not intended to be broken from C++11 through C++17.)
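To make the rule concrete, here is a minimal sketch of transparent replacement on a plain object, outside of std::optional. The type Widget and the function replace_and_read are hypothetical names chosen for illustration. The sketch uses an explicit destructor call and placement new, which do the same job as std::destroy_at and std::construct_at.

```cpp
#include <new>  // placement new

// Hypothetical type for illustration. It has a const member, which the
// pre-C++20 wording of [basic.life]/8 would not have allowed.
struct Widget {
    const int id;
    int value;
    Widget(int i, int v) : id(i), value(v) {}
};

// Ends the lifetime of the Widget and creates a new one in the same
// storage, then reads through a reference taken before the replacement.
inline int replace_and_read(Widget& w, int new_id, int new_value) {
    Widget& ref = w;  // refers to the original object
    w.~Widget();      // lifetime of the original object ends
    ::new (static_cast<void*>(&w)) Widget(new_id, new_value);  // same storage
    // Under the C++20 rule, ref now refers to the new object, assuming the
    // caller passed a complete, non-const Widget.
    return ref.id + ref.value;
}
```

Nothing touches the storage between the destructor call and the placement new, which is exactly the condition the quoted paragraph asks for.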

However, we are making the assumption that the new T object is being created "before the storage which the [old] object occupied is reused or released". If I were writing an "adversarial" implementation of the standard library, I could set it up so that the emplace call reuses the underlying storage prior to creating the new T object. This would prevent the old T object from being transparently replaced by the new one.

How might an implementation "reuse" the storage? Typically, the underlying storage might be declared like this:

union {
    char no_object;
    T object;
};

When the default constructor of optional is called, no_object is initialized (the value does not matter)¹. An emplace() call checks whether there is a T object or not (by checking a flag that is not shown here). If a T object is present, then object.~T() is called. Finally, something similar to construct_at(addressof(object)) is called in order to construct the new T object.
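The steps just described can be sketched as a stripped-down class. This is only an illustration under stated assumptions, not how any particular standard library implements it: TinyOptional, no_object_, object_ and engaged_ are hypothetical names, and copying, moving, reset() and constexpr support are all left out.

```cpp
#include <cassert>
#include <memory>   // std::addressof
#include <new>      // placement new
#include <utility>  // std::forward

// Simplified stand-in for std::optional<T>; all names are hypothetical.
template <class T>
class TinyOptional {
    union {
        char no_object_;  // active member while no T is stored
        T object_;        // storage for the contained value
    };
    bool engaged_ = false;  // the flag that records whether a T exists

public:
    TinyOptional() : no_object_{} {}  // the value does not matter
    ~TinyOptional() { if (engaged_) object_.~T(); }
    TinyOptional(const TinyOptional&) = delete;
    TinyOptional& operator=(const TinyOptional&) = delete;

    template <class... Args>
    T& emplace(Args&&... args) {
        if (engaged_) {
            object_.~T();  // destroy the old value, if any
            engaged_ = false;
        }
        // Create the new value in the same storage. Nothing else touches
        // the storage between the destructor call and this line.
        T* p = ::new (static_cast<void*>(std::addressof(object_)))
            T(std::forward<Args>(args)...);
        engaged_ = true;
        return *p;
    }

    T& value() {
        assert(engaged_);
        return object_;
    }
};
```

Like std::optional::emplace since C++17, this emplace returns a reference to the newly constructed value. Rebinding to that return value (or calling value() again) avoids having to rely on an old reference at all.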

Not that any implementation would ever do this, but you could imagine one that re-initializes the no_object member in between the calls to object.~T() and construct_at(addressof(object)). That would be a "reuse" of the storage previously occupied by object, so the requirements of [basic.life]/8 would not be met.
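As a rough sketch of that hypothetical sequence, with T fixed to int for brevity: the names Storage and adversarial_replace are made up for this illustration and do not come from any real implementation.

```cpp
#include <new>  // placement new

// Hypothetical storage mirroring the union shown above, with T = int.
union Storage {
    char no_object;
    int object;
};

// Replaces the int held in s, but creates a char in the same storage first.
// Precondition (assumed): s.object is the active member on entry.
inline int* adversarial_replace(Storage& s, int v) {
    // int is trivially destructible, so no destructor call is needed here.
    ::new (static_cast<void*>(&s.no_object)) char{};  // storage reused here
    // The new int is created only after the reuse, so the old int is not
    // transparently replaced by it.
    return ::new (static_cast<void*>(&s.object)) int(v);
}
```

The pointer returned by the final placement new does point to the new object. What the intervening reuse takes away is only the guarantee that references and pointers obtained before the call automatically refer to it.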

Of course, the practical answer to your question is that (1) there is no reason for an implementation to do something like this, and (2) even if an implementation did it, the developers would ensure that your code still behaves as if the T object was transparently replaced. Your code is reasonable under the assumption that the standard library implementation is reasonable, and compiler developers do not like to break code with that property, since doing so would needlessly aggravate their users.

But if a compiler developer were inclined to break your code (based on the argument that the more undefined behaviour there is, the more the compiler can optimize) then they could break your code even without changing the <optional> header file. The user is required to treat the standard library like a "black box" that only guarantees what the standard explicitly guarantees. So under a pedantic reading of the standard, it's unspecified whether or not attempting to access ref after the second emplace call has undefined behaviour. If it's unspecified whether it's UB, then the compiler is allowed to start treating it as UB whenever it wants.

¹ The reason for this is historical; C++17 requires that a constexpr constructor initialize exactly one variant member of a union. This rule was abolished in C++20, so a C++20 implementation could omit the no_object member.