devirtualize: to change a virtual/polymorphic/indirect function call into a static function call due to some guarantee that the change is correct -- source: myself
Given a simple trait object, &dyn ToString, created with a statically known type, String:
fn main() {
    let name: &dyn ToString = &String::from("Steve");
    println!("{}", name.to_string());
}
Does the call to .to_string() use <String as ToString>::to_string() directly? Or only indirectly via the trait's vtable? If indirectly, would it be possible to devirtualize this call? Or is there something fundamental that hinders this optimization?
The motivating code for this question is much more complicated; it uses async trait functions and I'm wondering if returning a Box<dyn Future> can be optimized in some cases.
Does Rust devirtualize trait object function calls?
No.
Rust is a language, it doesn't do anything; it only prescribes semantics.
In this specific case, the Rust language neither requires nor forbids devirtualization, so an implementation is permitted to do it.
At the moment, the only stable implementation is rustc, with the LLVM backend -- though you can use the cranelift backend if you feel adventurous.
You can test your code against this implementation on the playground: select "Show LLVM IR" instead of "Run" and "Release" instead of "Debug". You should then be able to check that there is no virtual call.
A revised version of the code isolates the cast to a trait object and the dynamic call in a single function, which makes the output easier to read:
#[inline(never)]
fn to_string(s: &String) -> String {
    let name: &dyn ToString = s;
    name.to_string()
}

fn main() {
    let name = String::from("Steve");
    let name = to_string(&name);
    println!("{}", name);
}
Which, when compiled on the playground, yields among other things:
; playground::to_string
; Function Attrs: noinline nonlazybind uwtable
define internal fastcc void @_ZN10playground9to_string17h4a25abbd46fc29d4E(%"std::string::String"* noalias nocapture dereferenceable(24) %0, %"std::string::String"* noalias readonly align 8 dereferenceable(24) %s) unnamed_addr #0 {
start:
; call <alloc::string::String as core::clone::Clone>::clone
tail call void @"_ZN60_$LT$alloc..string..String$u20$as$u20$core..clone..Clone$GT$5clone17h1e3037d7443348baE"(%"std::string::String"* noalias nocapture nonnull sret dereferenceable(24) %0, %"std::string::String"* noalias nonnull readonly align 8 dereferenceable(24) %s)
ret void
}
Where you can clearly see that the call to ToString::to_string has been replaced by a simple call to <String as Clone>::clone; a devirtualized call.
The motivating code for this question is much more complicated; it uses async trait functions and I'm wondering if returning a Box<dyn Future> can be optimized in some cases.
Unfortunately, you cannot draw any conclusion from the above example.
Optimizations are finicky. In essence, most optimizations are akin to pattern-matching and replacing with regexes: differences that look benign to a human may completely throw off the pattern-matching and prevent the optimization from applying.
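To illustrate the kind of benign-looking difference meant here, a minimal sketch with two hypothetical helpers, known_type and runtime_type (neither is from the question). In the first, the concrete type behind the trait object is fixed at compile time, as in the example above. In the second, it is selected by a run-time value, so there is no single static target to substitute; whether the optimizer still removes the indirect call (for example by specializing each branch) is implementation-specific and has to be checked in the emitted IR or assembly.

```rust
// Hypothetical helper: the concrete type behind the trait object is
// statically known to be String.
#[inline(never)]
fn known_type(s: &String) -> String {
    let name: &dyn ToString = s;
    name.to_string()
}

// Hypothetical helper: the concrete type behind the trait object depends
// on a run-time value, so the call has two possible targets.
#[inline(never)]
fn runtime_type(use_number: bool) -> String {
    let number = 42_i32;
    let text = String::from("Steve");
    // Both branches coerce to &dyn ToString because of the annotation.
    let name: &dyn ToString = if use_number { &number } else { &text };
    name.to_string()
}
```

Both functions behave the same from the caller's point of view; only the generated code may differ. Calling them from main and comparing their "Show LLVM IR" output in Release mode is one way to see whether either call stays indirect.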
The only way to be certain that the optimization is applied in your case, if it matters, is to inspect the emitted assembly.
But, really, in this case, I'd be more worried about the memory allocation than about the virtual call. A virtual call is about 5ns of overhead -- though it does inhibit a number of optimizations -- whereas a memory allocation (and the eventual deallocation) routinely costs 20ns to 30ns.
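For the Box<dyn Future> case, a minimal sketch of the two return styles, assuming the caller does not need type erasure. The names boxed_len, unboxed_len, NoopWake and block_on are hypothetical, and block_on is a toy executor that is only suitable for futures that are ready immediately. The boxed version requests a heap allocation on every call and is polled through a vtable unless the optimizer manages to remove it; the impl Future version returns the concrete future by value, so no allocation is requested and poll is dispatched statically.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Waker that does nothing; enough for futures that never actually wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Toy executor (an assumption for this sketch): polls in a loop until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut); // pin on the stack, no allocation
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

// Type-erased future: one heap allocation per call, polled via a vtable.
fn boxed_len(s: &str) -> Pin<Box<dyn Future<Output = usize> + '_>> {
    Box::pin(async move { s.len() })
}

// Concrete (but unnamed) future type: returned by value, polled directly.
fn unboxed_len(s: &str) -> impl Future<Output = usize> + '_ {
    async move { s.len() }
}
```

Both block_on(boxed_len("Steve")) and block_on(unboxed_len("Steve")) evaluate to 5. The unboxed form is only an option where the concrete future type can stay visible to the caller; where a trait object is genuinely required, the allocation remains unless the optimizer happens to remove it.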