I have a custom JS iterator implementation and code for measuring performance of the latter implementation:
const ITERATION_END = Symbol('ITERATION_END');
const arrayIterator = (array) => {
let index = 0;
return {
hasValue: true,
next() {
if (index >= array.length) {
this.hasValue = false;
return ITERATION_END;
}
return array[index++];
},
};
};
const customIterator = (valueGetter) => {
return {
hasValue: true,
next() {
const nextValue = valueGetter();
if (nextValue === ITERATION_END) {
this.hasValue = false;
return ITERATION_END;
}
return nextValue;
},
};
};
const map = (iterator, selector) => customIterator(() => {
const value = iterator.next();
return value === ITERATION_END ? value : selector(value);
});
const filter = (iterator, predicate) => customIterator(() => {
if (!iterator.hasValue) {
return ITERATION_END;
}
let currentValue = iterator.next();
while (iterator.hasValue && currentValue !== ITERATION_END && !predicate(currentValue)) {
currentValue = iterator.next();
}
return currentValue;
});
const toArray = (iterator) => {
const array = [];
while (iterator.hasValue) {
const value = iterator.next();
if (value !== ITERATION_END) {
array.push(value);
}
}
return array;
};
const test = (fn, iterations) => {
const times = [];
for (let i = 0; i < iterations; i++) {
const start = performance.now();
fn();
times.push(performance.now() - start);
}
console.log(times);
console.log(times.reduce((sum, x) => sum + x, 0) / times.length);
}
const createData = () => Array.from({ length: 9000000 }, (_, i) => i + 1);
const testIterator = (data) => () => toArray(map(filter(arrayIterator(data), x => x % 2 === 0), x => x * 2))
test(testIterator(createData()), 10);
The output of the test function is very weird and unexpected - the first test run is constantly executed two times faster than all the other runs. One of the results, where the array contains all execution times and the number is the mean (I ran it on Node):
[
147.9088459983468,
396.3472499996424,
374.82447600364685,
367.74555300176144,
363.6300039961934,
362.44370299577713,
363.8418449983001,
390.86111199855804,
360.23125199973583,
358.4788999930024
]
348.6312940984964
Similar results can be observed using Deno runtime, however I could not reproduce this behaviour on other JS engines. What can be the reason behind it on the V8?
Environment: Node v13.8.0, V8 v7.9.317.25-node.28, Deno v1.3.3, V8 v8.6.334
(V8 developer here.) In short: it's inlining, or lack thereof, as decided by engine heuristics.
For an optimizing compiler, inlining a called function can have significant benefits (e.g.: avoids the call overhead, sometimes makes constant folding possible, or elimination of duplicate computations, sometimes even creates new opportunities for additional inlining), but comes at a cost: it makes the compilation itself slower, and it increases the risk of having to throw away the optimized code ("deoptimize") later due to some assumption that turns out not to hold. Inlining nothing would waste performance, inlining everything would waste performance, inlining exactly the right functions would require being able to predict the future behavior of the program, which is obviously impossible. So compilers use heuristics.
V8's optimizing compiler currently has a heuristic to inline functions only if it was always the same function that was called at a particular place. In this case, that's the case for the first iterations. Subsequent iterations then create new closures as callbacks, which from V8's point of view are new functions, so they don't get inlined. (V8 actually knows some advanced tricks that allow it to de-duplicate function instances coming from the same source in some cases and inline them anyway; but in this case those are not applicable [I'm not sure why]).
So in the first iteration, everything (including x => x % 2 === 0
and x => x * 2
) gets inlined into toArray
. From the second iteration onwards, that's no longer the case, and instead the generated code performs actual function calls.
That's probably fine; I would guess that in most real applications, the difference is barely measurable. (Reduced test cases tend to make such differences stand out more; but changing the design of a larger app based on observations made on a small test is often not the most impactful way to spend your time, and at worst can make things worse.)
Also, hand-optimizing code for engines/compilers is a difficult balance. I would generally recommend not to do that (because engines improve over time, and it really is their job to make your code fast); on the other hand, there clearly is more efficient code and less efficient code, and for maximum overall efficiency, everyone involved needs to do their part, i.e. you might as well make the engine's job simpler when you can.
If you do want to fine-tune performance of this, you can do so by separating code and data, thereby making sure that always the same functions get called. For example like this modified version of your code:
const ITERATION_END = Symbol('ITERATION_END');
class ArrayIterator {
constructor(array) {
this.array = array;
this.index = 0;
}
next() {
if (this.index >= this.array.length) return ITERATION_END;
return this.array[this.index++];
}
}
function arrayIterator(array) {
return new ArrayIterator(array);
}
class MapIterator {
constructor(source, modifier) {
this.source = source;
this.modifier = modifier;
}
next() {
const value = this.source.next();
return value === ITERATION_END ? value : this.modifier(value);
}
}
function map(iterator, selector) {
return new MapIterator(iterator, selector);
}
class FilterIterator {
constructor(source, predicate) {
this.source = source;
this.predicate = predicate;
}
next() {
let value = this.source.next();
while (value !== ITERATION_END && !this.predicate(value)) {
value = this.source.next();
}
return value;
}
}
function filter(iterator, predicate) {
return new FilterIterator(iterator, predicate);
}
function toArray(iterator) {
const array = [];
let value;
while ((value = iterator.next()) !== ITERATION_END) {
array.push(value);
}
return array;
}
function test(fn, iterations) {
for (let i = 0; i < iterations; i++) {
const start = performance.now();
fn();
console.log(performance.now() - start);
}
}
function createData() {
return Array.from({ length: 9000000 }, (_, i) => i + 1);
};
function even(x) { return x % 2 === 0; }
function double(x) { return x * 2; }
function testIterator(data) {
return function main() {
return toArray(map(filter(arrayIterator(data), even), double));
};
}
test(testIterator(createData()), 10);
Observe how there are no more dynamically created functions on the hot path, and the "public interface" (i.e. the way arrayIterator
, map
, filter
, and toArray
compose) is exactly the same as before, only under-the-hood details have changed. A benefit of giving all functions names is that you get more useful profiling output ;-)
Astute readers will notice that this modification only shifts the issue away: if you have several places in your code that call map
and filter
with different modifiers/predicates, then the inlineability issue will come up again. As I said above: microbenchmarks tend to be misleading, as real apps typically have different behavior...
(FWIW, this is pretty much the same effect as at Why is the execution time of this function call changing? .)