Tags: webassembly, emscripten, asm.js, wasi, wasmtime

Why is WebAssembly safe and what is the linear memory model?


(1) I have heard that WebAssembly gets its safety from providing a linear memory. What does this linear memory contain? Are the Wasm stack and heap located in this memory space? If so, I assume the Wasm stack and the stack of the glue code (e.g., JavaScript, Python, etc.) are separate, right?

(2) I understand that part of Wasm's memory safety comes from the import table. In other words, a Wasm function cannot call any function outside the linear memory, because it can only call imported functions by index. Besides this, what other safety does Wasm provide? Perhaps it relates to the stack question above.

(3) It looks like Wasm also has control flow integrity. That is, every function's return address is fixed and cannot be modified from inside the function. Is this understanding correct?


Solution

  • (1) The linear memory is one large array of bytes, and WebAssembly offers load and store instructions to manipulate the bytes in this array. No WebAssembly instruction works on native pointers. Wasm code can only load from and store to its linear memory (or, once multiple-memory support is available, any one of its linear memories).

    (2) WebAssembly is essentially a sandbox. WebAssembly code isn't native CPU code on any platform; instead, a specification defines how each WebAssembly instruction behaves, so that an engine can interpret or compile it. You can do only the things the specification says you can.

    There isn't any instruction pointer you can move (there aren't any registers at all in WebAssembly). You can never have a pointer to the stack (put another way, the stack holds WebAssembly values, not bytes, and does not have any address for you to take). You can't trigger system calls or interrupts (there aren't any such instructions in WebAssembly). You can't mutate or even read your own code, and you can't JIT and produce new code. The Wasm code's access to the outside world is limited to exactly the interface the host embedder gives it.

    WebAssembly does not protect against problems inside the sandbox. If your C++ code executes undefined behaviour, it can do anything it likes inside that sandbox, including poking the interface to the host in any possible way, but the full C++ UB experience does not transfer to the embedder. In that way, it can be thought of as a mini-process inside your process, or as a container for a library.

    (3) No WebAssembly instruction lets you read or write the function return stack, so return addresses cannot be modified from Wasm code.
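
To make point (1) concrete, here is a minimal Python sketch of the linear memory model. It is a toy model, not how any real engine is implemented: the `LinearMemory` class, its method names, and the `OutOfBounds` exception are made up for illustration. What it does mirror is that memory is a flat byte array sized in 64 KiB pages, that multi-byte values are stored little-endian, and that an access outside the array traps instead of touching anything else.

```python
import struct


class OutOfBounds(Exception):
    """Hypothetical stand-in for the trap a Wasm engine raises."""


class LinearMemory:
    """Toy model of one linear memory: a flat, zero-initialised byte array."""

    PAGE_SIZE = 65536  # Wasm memories are sized in 64 KiB pages

    def __init__(self, pages: int) -> None:
        self.data = bytearray(pages * self.PAGE_SIZE)

    def _check(self, addr: int, size: int) -> None:
        # Every access is checked against the current size, so an address
        # can never refer to anything outside `self.data`.
        if addr < 0 or addr + size > len(self.data):
            raise OutOfBounds(f"access of {size} bytes at address {addr}")

    def store_i32(self, addr: int, value: int) -> None:
        self._check(addr, 4)
        struct.pack_into("<I", self.data, addr, value & 0xFFFFFFFF)

    def load_i32(self, addr: int) -> int:
        self._check(addr, 4)
        return struct.unpack_from("<I", self.data, addr)[0]


mem = LinearMemory(pages=1)
mem.store_i32(16, 0xDEADBEEF)
value = mem.load_i32(16)

try:
    # A 4-byte read that starts 2 bytes before the end of memory.
    mem.load_i32(LinearMemory.PAGE_SIZE - 2)
    trapped = False
except OutOfBounds:
    trapped = True
```

An "address" here is just an integer offset into `mem.data`. It cannot name the host's own memory, which is the sense in which Wasm has no native pointers.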
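
Points (2) and (3) can be illustrated with an equally small sketch. The assumptions are loud ones: `ToyInstance`, the opcode names (`const`, `add`, `fill`, `call`), and the host function are all hypothetical, and this is not real Wasm bytecode. A real engine also rejects an invalid direct-call index at validation time, whereas this model folds that check into a runtime trap. The sketch shows that functions are reached only by index, that the operand stack holds values rather than addressable bytes, and that return information lives in the engine (here, Python's own call stack), out of reach of guest stores.

```python
from typing import Callable, List, Sequence, Tuple


class Trap(Exception):
    """Hypothetical stand-in for a Wasm trap."""


class ToyInstance:
    def __init__(
        self,
        imports: Sequence[Callable[[int], int]],
        functions: Sequence[Sequence[Tuple]],
        memory_size: int = 64,
    ) -> None:
        # Function index space: imported functions first, then the module's own.
        self.imports = list(imports)
        self.functions = list(functions)
        self.memory = bytearray(memory_size)
        self.stack: List[int] = []  # operand stack: values only, no addresses

    def call(self, index: int) -> None:
        if index < 0:
            raise Trap(f"no function at index {index}")
        if index < len(self.imports):
            # The only way out of the sandbox: a host function chosen by index.
            self.stack.append(self.imports[index](self.stack.pop()))
            return
        local = index - len(self.imports)
        if local >= len(self.functions):
            raise Trap(f"no function at index {index}")
        for instr in self.functions[local]:
            op = instr[0]
            if op == "const":
                self.stack.append(instr[1])
            elif op == "add":
                b, a = self.stack.pop(), self.stack.pop()
                self.stack.append(a + b)
            elif op == "fill":
                # Overwrite every byte the guest is able to address.
                self.memory[:] = bytes([instr[1]]) * len(self.memory)
            elif op == "call":
                self.call(instr[1])
            else:
                raise Trap(f"unknown instruction {op!r}")
        # Returning here relies on the host's call stack, which no guest
        # instruction can read or write.


inst = ToyInstance(
    imports=[lambda x: x * 2],  # index 0: hypothetical host function
    functions=[
        [("fill", 0xFF), ("const", 5)],                      # index 1
        [("call", 1), ("call", 0), ("const", 1), ("add",)],  # index 2
    ],
)
inst.call(2)
result = inst.stack.pop()

try:
    inst.call(7)  # no such function index
    rejected = False
except Trap:
    rejected = True
```

Function 1 scribbles over all of linear memory, yet control still returns to function 2 exactly where it left off, because nothing the guest can address holds a return address.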