Tags: c++, constructor, language-lawyer, static-cast

Can you static_cast "this" to a derived class in a base class constructor then use the result later?


We ran into this scenario in our codebase at my work, and we had a big debate over whether this is valid C++ or not. Here is the simplest code example I could come up with:

#include <iostream>

template <class T>
class A {
public:
    A() { subclass = static_cast<T*>(this); }
    virtual void Foo() = 0;
protected:
    T* subclass;
};

class C : public A<C> {
public:
    C(int i) : i(i) { }
    virtual void Foo() { subclass->Bar(); }
    void Bar() { std::cout << "i is " << i << std::endl; }
private:
    int i;
};

int main() {
    C c(5);
    c.Foo();
    return 0;
}

This code works 100% of the time in practice (as long as the template parameter type matches the subclass type), but if we run it through a runtime analyzer, it tells us that the static_cast is invalid because we're casting this to a C* but the C constructor hasn't run yet. Sure enough, if we change the static_cast to a dynamic_cast, it returns nullptr and this program will fail and crash when accessing i in Bar().

My intuition is that it should always be possible to replace static_cast with dynamic_cast without breaking your code, suggesting that the original code in fact is depending on compiler-specific undefined behavior. However, on cppreference it says:

If the object expression refers or points to is actually a base class subobject of an object of type D, the result refers to the enclosing object of type D.

So the question is: is it already a base class subobject of an object of type D before the object of type D has finished being constructed, or is this undefined behavior? My level of C++ rules lawyering is not strong enough to work this out.


Solution

  • In my opinion this is well-defined according to the current wording of the standard: the C object exists at the time of the static_cast, although it is under construction and its lifetime has not yet begun. This would seem to make the static_cast well-defined according to [expr.static.cast]/11, which reads in part:

    ... If the prvalue of type “pointer to cv1 B” points to a B that is actually a base class subobject of an object of type D, the resulting pointer points to the enclosing object of type D. Otherwise, the behavior is undefined.

    It doesn't say that the D object's lifetime must have begun.

    We might also want to look at the explicit rule about when it becomes legal to perform an implicit conversion from derived to base, [class.cdtor]/3:

    To explicitly or implicitly convert a pointer (a glvalue) referring to an object of class X to a pointer (reference) to a direct or indirect base class B of X, the construction of X and the construction of all of its direct or indirect bases that directly or indirectly derive from B shall have started and the destruction of these classes shall not have completed, otherwise the conversion results in undefined behavior. To form a pointer to (or access the value of) a direct non-static member of an object obj, the construction of obj shall have started and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing the member value) results in undefined behavior.

    According to this rule, as soon as the compiler starts constructing the base class A<C>, it is well-defined to implicitly convert from C* to A<C>*; before that point, the conversion is UB. The reason has to do with virtual base classes: if the path by which C inherits A<C> contains any virtual inheritance, the conversion may rely on data that are set up by one of the constructors in the chain. Are those data also sufficient for going the other way, from base to derived? We don't really need to ask: if there is any virtual inheritance on the chain, a static_cast from base to derived will not compile.

    I can't find anything in the text of the standard, or any rationale, that would make the static_cast in your example undefined. The same goes for any other static_cast from base to derived performed at a point where the reverse implicit conversion (or static_cast) would be allowed. The exception is virtual inheritance, which, as I said before, leads to a compile error anyway.

    (Would it be well-defined to do it even earlier? In most cases this won't be possible; how could you possibly attempt to static_cast from B* to D* before the conversion from D* to B* is allowed, without having obtained the B* pointer precisely by doing the latter? If the answer is that you got from D* to B* through an intermediate base class C1 whose constructor has started, but there is another intermediate base class C2 sharing the same B base class subobject and its construction hasn't started yet, then B is a virtual base class, and again, this means the compiler will stop you from then trying to static_cast from B* back down to D*. So I think there are no issues left to resolve here.)