I have a shallow class hierarchy (minimal reproducible code example at the bottom) where I'm using an abstract base class to hold mostly common logic between a bunch of related classes, with a virtual function (modifier()) that child classes override to influence the behaviour of a shared method in the parent (logic()).
There is no real polymorphism involved in the code, i.e. I am not passing the child classes around as parent class pointers anywhere. In fact, the base class is not mentioned anywhere other than in the declarations of the child classes, and to enforce this the base class is even in an anonymous namespace (internal linkage). The child classes are declared final, created as themselves and kept that way; to outside code the hierarchy might as well not exist. I am basically using the inheritance mechanism just to mechanically share most of the code between various related classes, without having to do a lot of copy-pasting myself.
Given that, I would think these classes would be prime candidates for devirtualisation, but it does not seem to kick in with current compilers like gcc 13.2 or clang 17.0. The compiler can elide vtables in very trivial cases, but with just a little bit of indirection, vtables are generated and used even at -O3; see this godbolt:
https://godbolt.org/z/67479h619
Is there any way to make this code more optimisation-friendly and get rid of the vtables more reliably? Does the standard say anything about this optimisation, or does it all depend on the compilers and the direction of the wind? Or is the only good solution "manual devirtualisation", i.e. getting rid of the virtual function entirely by copy-pasting logic() into every child class by hand?
Here is the same sample code from the above godbolt:
// devirtualisation test
namespace ns {
namespace {
    struct Base {
        int common() { return 3; }
        virtual int modifier() = 0;
        int logic(int n) {
            return (n * modifier()) % common();
        }
    };
} // namespace <anonymous>

struct ChildA final : Base {
    int modifier() { return 2; }
};
struct ChildB final : Base {
    int modifier() { return 3; }
};
struct ChildC final : Base {
    int modifier() { return 4; }
};
} // namespace ns

using namespace ns;
#include <iostream>
using namespace std;

// get* functions to confuse compilers a little
ChildA getA() {
    return ChildA();
}
ChildB getB() {
    return ChildB();
}

int main() {
    cout << getA().logic(2) << "\n";
    cout << getB().logic(2) << "\n";
    cout << ChildC().logic(2) << "\n";
}
Obviously the real example is a bit more sophisticated: it's a bunch of classes for iteratively walking the nodes of a tree, where the virtual function is responsible for deciding when the walk stops and returns a value. So I can create an iterator for inorder, preorder, leaf-only, etc. walks just by overriding that method. Here's a godbolt if that seems like a useful test (the counterparts for logic() and modifier() are bstiterbase<T>::walk() and bstiterbase<T>::stop() respectively):
First of all, your Base being defined in an anonymous namespace means that trying to define any of your derived classes in another translation unit would be an ODR violation (because Base would refer to a different class in a different, new, anonymous namespace).
Secondly, calling a function on a prvalue will always be devirtualised, because the dynamic type of the object is known at the call site. So will calling a function on a glvalue whose static type is a final class. The first example you've given is completely devirtualised (i.e., there is no indirection into any vtable). The vtable itself still can't be removed, because it could be used in other translation units.
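To make the point about final classes concrete, here is a minimal sketch. The helpers call_via_final and call_via_base are hypothetical names that do not appear in the question, and the virtual destructor is added only for the sketch. Whether the second call is devirtualised depends on how much of the object's origin the compiler can see, so treat the comments as the typical outcome rather than a guarantee.

```cpp
// Sketch only: call_via_final and call_via_base are hypothetical helpers.
struct Base {
    virtual int modifier() = 0;
    virtual ~Base() = default;
};

struct ChildA final : Base {
    int modifier() override { return 2; }
};

// The static type is a final class, so the dynamic type must be ChildA.
// The compiler may call ChildA::modifier directly and inline it.
int call_via_final(ChildA& a) { return a.modifier(); }

// The static type is Base, so the dynamic type is not known here.
// Unless the compiler can see where b came from (e.g. after inlining),
// this is typically an indirect call through the vtable.
int call_via_base(Base& b) { return b.modifier(); }
```

Both functions return the same value for a ChildA object; the difference only shows up in the generated code.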
Third, in your bigger "real example", to devirtualise the call to stop, the compiler must entirely inline step() and walk(). So the tradeoff is the extra code size needed to do this vs the speed lost by just calling the virtual function.
If you want to force the compiler to do this, you can use the CRTP:
// *Not* in an anonymous namespace
namespace detail {
    template<typename Self>
    struct Base {
        Self& self() { return static_cast<Self&>(*this); }
        int common() { return 3; }
        int logic(int n) {
            return (n * self().modifier()) % common();
        }
    };
}

struct ChildA final : detail::Base<ChildA> {
    int modifier() { return 2; }
};
struct ChildB final : detail::Base<ChildB> {
    int modifier() { return 3; }
};
struct ChildC final : detail::Base<ChildC> {
    int modifier() { return 4; }
};

// Or if you have C++23
namespace detail {
    struct Base {
        int common() { return 3; }
        int logic(this auto&& self, int n) {
            return (n * self.modifier()) % self.common();
        }
    };
}

struct ChildA final : detail::Base {
    int modifier() { return 2; }
};
struct ChildB final : detail::Base {
    int modifier() { return 3; }
};
struct ChildC final : detail::Base {
    int modifier() { return 4; }
};
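As a quick check of the CRTP variant, the sketch below repeats it in compact form and adds two things that are not part of the answer: a static_assert showing that the classes are no longer polymorphic (so no vtable exists for them), and a hypothetical helper sum_of_logic that exercises all three classes.

```cpp
#include <type_traits>

namespace detail {
    template <typename Self>
    struct Base {
        Self& self() { return static_cast<Self&>(*this); }
        int common() { return 3; }
        // modifier() is resolved at compile time through Self, not a vtable.
        int logic(int n) { return (n * self().modifier()) % common(); }
    };
} // namespace detail

struct ChildA final : detail::Base<ChildA> { int modifier() { return 2; } };
struct ChildB final : detail::Base<ChildB> { int modifier() { return 3; } };
struct ChildC final : detail::Base<ChildC> { int modifier() { return 4; } };

// No virtual functions anywhere in the hierarchy, so there is no vtable.
static_assert(!std::is_polymorphic<ChildA>::value,
              "CRTP children should not be polymorphic");

// Hypothetical helper (not in the original code) that uses each class.
int sum_of_logic(int n) {
    return ChildA().logic(n) + ChildB().logic(n) + ChildC().logic(n);
}
```

With n = 2 the three classes give (2 * 2) % 3 = 1, (2 * 3) % 3 = 0 and (2 * 4) % 3 = 2, the same values the virtual version in the question computes.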