I have a computationally heavy function which is called many times in a loop:

```js
function func() { // Some fluff
  for (let i = 0; i < 1000; i++) {
    i *= 10
    i /= 10
  }
}

function run() {
  for (let i = 0; i < 100000; i++) {
    func()
  }
}

run()
```
When I profile this script using Chrome's DevTools, I see that `run` has a self time of 887 ms out of a total time of 1015 ms, even though the only thing it does is repeatedly call `func`. I would expect `func` to have the majority of the self time, since it is the leaf function. Why is this?
(V8 developer here.)
> the function was automatically inlined after some time when it became "hot".

Correct. As soon as `run` is optimized, the optimizer decides to inline `func` into it. After that, as far as the profiler is concerned, all the time is spent in `run`.
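To illustrate the effect, here is a hand-written sketch (not actual compiler output) of what `run` effectively becomes once `func` is inlined: the callee's body is pasted into the caller, so there is no separate `func` frame left for the profiler to attribute ticks to.

```js
// Hand-written sketch of run() after func() has been inlined into it.
// All samples now land in this one frame, which is why the profiler
// reports the time as run's self time. (Illustrative only; not what
// V8 literally emits.)
function runInlined() {
  for (let i = 0; i < 100000; i++) {
    // former body of func, with its loop variable renamed to avoid
    // clashing with the outer i
    for (let j = 0; j < 1000; j++) {
      j *= 10
      j /= 10
    }
  }
}

runInlined()
```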
(To verify this, run the snippet in `d8` or `node` with `--trace-turbo-inlining`.)
Side note: getting to optimized code for `run` takes a bit longer than usual in this case, because the function never returns to get called again (which is the best time to switch to optimized code). The system waits a bit for that to happen, and when it doesn't happen, `run` is eventually "on-stack replaced". This is a typical pattern that occurs frequently in small tests and benchmarks, and rarely in real-world code.
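For comparison, a common benchmark shape that does not have to rely on on-stack replacement is to call the hot function several times, so that it returns between invocations and the engine gets its preferred moment to switch to optimized code. A minimal sketch (iteration counts reduced; `work` and `runOnce` are stand-ins for the snippet's `func` and `run`, not anything from the original post):

```js
// Stand-in for the heavy function from the question (counts reduced).
function work() {
  for (let i = 0; i < 1000; i++) {
    i *= 10
    i /= 10
  }
}

function runOnce() {
  for (let i = 0; i < 1000; i++) {
    work()
  }
}

// Because runOnce() returns between invocations, the engine can enter
// the optimized version on a later call, instead of having to replace
// the function on the stack mid-loop.
for (let round = 0; round < 10; round++) {
  runOnce()
}
```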
> Doesn't this just show that 100,000 function calls are more expensive than 1,000 iterations of two simple arithmetic operations, which makes sense?
No, it doesn't show that; that's just one particular way one could be misled by this microbenchmark.
Personally, I'm mildly disappointed to see (with `--print-opt-code`) that the compiler didn't realize that `i *= 10; i /= 10;` is a no-op and could be dropped entirely. That would have been another great way to be misled here. Oh well, there's probably some reason why it's harder than it looks for the compiler to figure out that that transformation would be both applicable and safe...
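One concrete illustration of why such a transformation needs care (a sketch of one possible reason, not a claim about what actually blocks it in V8): for an arbitrary double, `(x * 10) / 10` is not always `x`, since the multiplication can overflow to `Infinity`, which the division cannot undo. So the rewrite is only valid once the compiler has proven a narrow enough value range for `i`.

```js
// For a general double, (x * 10) / 10 need not round-trip back to x:
// the intermediate multiplication can overflow to Infinity.
const x = 1e308;                 // finite double
const roundTripped = (x * 10) / 10;
console.log(roundTripped);       // Infinity
console.log(roundTripped === x); // false
```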