Apologies for the broad question! I'm learning WASM and have created a Mandelbrot algorithm in C:
int iterateEquation(float x0, float y0, int maxiterations) {
  float a = 0, b = 0, rx = 0, ry = 0;
  int iterations = 0;
  while (iterations < maxiterations && (rx * rx + ry * ry <= 4.0)) {
    rx = a * a - b * b + x0;
    ry = 2.0 * a * b + y0;
    a = rx;
    b = ry;
    iterations++;
  }
  return iterations;
}

void mandelbrot(int *buf, float width, float height) {
  for(float x = 0.0; x < width; x++) {
    for(float y = 0.0; y < height; y++) {
      // map to mandelbrot coordinates
      float cx = (x - 150.0) / 100.0;
      float cy = (y - 75.0) / 100.0;
      int iterations = iterateEquation(cx, cy, 1000);
      int loc = ((x + y * width) * 4);
      // set the red and alpha components
      *(buf + loc) = iterations > 100 ? 255 : 0;
      *(buf + (loc+3)) = 255;
    }
  }
}
I'm compiling to WASM as follows (filename input / output omitted for clarity)
clang -emit-llvm -O3 --target=wasm32 ...
llc -march=wasm32 -filetype=asm ...
s2wasm --initial-memory 6553600 ...
wat2wasm ...
I'm loading in JavaScript, compiling, then invoking as follows:
instance.exports.mandelbrot(0, 300, 150)
The output is being copied to a canvas, which enables me to verify that it is executed correctly. On my computer the above function takes around 120ms to execute.
However, here's a JavaScript equivalent:
const iterateEquation = (x0, y0, maxiterations) => {
  let a = 0, b = 0, rx = 0, ry = 0;
  let iterations = 0;
  while (iterations < maxiterations && (rx * rx + ry * ry <= 4)) {
    rx = a * a - b * b + x0;
    ry = 2 * a * b + y0;
    a = rx;
    b = ry;
    iterations++;
  }
  return iterations;
}

const mandelbrot = (data) => {
  for (var x = 0; x < 300; x++) {
    for (var y = 0; y < 150; y++) {
      const cx = (x - 150) / 100;
      const cy = (y - 75) / 100;
      const res = iterateEquation(cx, cy, 1000);
      const idx = (x + y * 300) * 4;
      data[idx] = res > 100 ? 255 : 0;
      data[idx+3] = 255;
    }
  }
}
Which only takes ~62ms to execute.
Now I know WebAssembly is very new, and is not terribly optimised. But I can't help feeling that it should be faster than this!
Can anyone spot something obvious I might have missed?
Also, my C code writes directly to memory starting at '0' - I am wondering if this is safe? Where is the stack stored in the paged linear memory? Am I going to risk overwriting it?
Here's a fiddle to illustrate:
https://wasdk.github.io/WasmFiddle/?jvoh5
When run, it logs the timings of the two equivalent implementations (WASM then JavaScript)
Usually you can hope for a ~10% boost on heavy math compared to optimized JS. That consists of:
Note that copying a Uint8Array is notably slow in Chrome (it is fine in Firefox). When you work with RGBA data, it's better to view the underlying buffers as Uint32Array and use .set() on them.
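As a rough sketch of the Uint32Array suggestion above: both buffers are viewed as 32-bit words, so each pixel is copied as one element rather than four. The names `copyPixels`, `src`, and `dst` are made up for illustration, and `src` only stands in for the wasm output. Whether this is actually faster depends on the browser.

```javascript
// Copy RGBA pixels between two byte buffers through 32-bit views.
// Assumption: byte offsets and lengths are multiples of 4, which
// holds for whole RGBA pixels starting at an aligned offset.
function copyPixels(src, dst) {
  var src32 = new Uint32Array(src.buffer, src.byteOffset, src.byteLength >>> 2);
  var dst32 = new Uint32Array(dst.buffer, dst.byteOffset, dst.byteLength >>> 2);
  dst32.set(src32); // one element per pixel instead of four
}

// Stand-in for the wasm output: a 300x150 RGBA image.
var src = new Uint8Array(300 * 150 * 4);
src[0] = 255; // red component of the first pixel
src[3] = 255; // alpha component of the first pixel

// Stand-in for the canvas ImageData buffer.
var dst = new Uint8ClampedArray(src.length);
copyPixels(src, dst);
```

Because both views share the platform's byte order, the bytes arrive in the same positions they had in the source.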
Reading and writing pixels by word (rgba) in wasm runs at the same speed as reading and writing bytes (r, g, b, a). I did not find any difference.
When using node.js for development (as I do), it's worth staying on 8.2.1 for JS benchmarks. The next version upgraded v8 to v6.0 and introduced serious speed regressions for this kind of math. On 8.2.1, don't use modern ES6 features like const, => and so on; use ES5 instead. Maybe the next version, with v8 v6.2, will fix those issues.
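For reference, here is roughly what the question's JavaScript looks like in ES5 style, with `var` and function declarations in place of `const`, `let`, and arrow functions. The logic is unchanged. The plain `Uint8ClampedArray` used as `data` is an assumption standing in for the canvas pixel buffer.

```javascript
// ES5-style version of the iteration function from the question.
function iterateEquation(x0, y0, maxiterations) {
  var a = 0, b = 0, rx = 0, ry = 0;
  var iterations = 0;
  while (iterations < maxiterations && (rx * rx + ry * ry <= 4)) {
    rx = a * a - b * b + x0;
    ry = 2 * a * b + y0;
    a = rx;
    b = ry;
    iterations++;
  }
  return iterations;
}

// ES5-style version of the rendering loop (300x150, RGBA).
function mandelbrot(data) {
  for (var x = 0; x < 300; x++) {
    for (var y = 0; y < 150; y++) {
      var cx = (x - 150) / 100;
      var cy = (y - 75) / 100;
      var res = iterateEquation(cx, cy, 1000);
      var idx = (x + y * 300) * 4;
      data[idx] = res > 100 ? 255 : 0; // red
      data[idx + 3] = 255;             // alpha
    }
  }
}

// Assumed stand-in for the canvas ImageData buffer.
var data = new Uint8ClampedArray(300 * 150 * 4);
mandelbrot(data);
```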
Use wasm-opt -O3, which may sometimes help after clang -O3.
Use s2wasm --import-memory instead of hardcoding a fixed memory size.
Use benchmark.js, which is more precise.
In short: before continuing, it's worth cleaning things up.
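To illustrate the --import-memory suggestion: once the module imports its memory, the JavaScript side can create it and size it for the image. This is only a sketch. `pagesFor` is a hypothetical helper, and the import names `env` and `memory` are an assumption about what the toolchain emits, so check the generated module.

```javascript
// One WebAssembly memory page is 64 KiB.
var PAGE_SIZE = 65536;

// Hypothetical helper: number of pages needed to hold `bytes` bytes.
function pagesFor(bytes) {
  return Math.ceil(bytes / PAGE_SIZE);
}

var width = 300, height = 150;
var byteLength = width * height * 4; // one byte per RGBA channel

// Memory created by JavaScript and sized for the image.
var memory = new WebAssembly.Memory({ initial: pagesFor(byteLength) });

// Assumption: the module imports its memory as "env"."memory".
var imports = { env: { memory: memory } };

// With the compiled bytes in hand, instantiation would look like:
//   WebAssembly.instantiate(wasmBytes, imports).then(function (result) { ... });

// View over the pixel region, starting at offset 0 as in the question.
var pixels = new Uint8ClampedArray(memory.buffer, 0, byteLength);
```

The same `memory.buffer` is then visible to both sides, so the pixel view can be read after the exported function returns.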
You may find it useful to dig through the https://github.com/nodeca/multimath sources, or to use it in your experiments. I created it specifically for small CPU-intensive tasks, to simplify proper module init, memory management, JS fallbacks and so on. It contains an 'unsharp mask' implementation as an example, plus benchmarks. It should not be difficult to adapt your code to it.