security compilation llvm

Can I deliberately compile nondeterministically?


Address space layout randomization is a decently effective method for defeating exploits against a binary: the exploit first has to locate the memory addresses it wishes to attack and can't rely on them being constant. I'm interested in taking this further and introducing randomness into the compilation process itself to vary low-level implementation details, such as putting variables in different registers or in a different order in the stack frame, running LLVM passes in a different order so that functions and constexprs inline differently, or perhaps even introducing a 1-in-1000 bounds or null check that would otherwise be omitted from a fully optimized build. A different build could be created per user, so that each of my users gets their own binary that adheres to the source code but exhibits different side channels and different undefined behavior (UB), thus greatly limiting the effectiveness of any attacks that rely on those things.

Is there any major build toolchain/language that can be configured to do this? If not, is there a way I could simulate something like this in my source code?

(I would also welcome a frame challenge to this idea. For instance, I acknowledge that having a fully reproducible build might be more valuable than anything I might get from giving every user a personalized build. However, I also feel that this idea could be useful in testing, so that devs can be confident that it's their source code, and not the specific version of the compiler they're using, that determines the software's correctness.)


Solution

  • As was mentioned in the comments, it is indeed possible, but the downsides greatly outweigh the benefits:

    • Adding nondeterministic compilation breaks reproducible builds, so users will no longer be able to verify the packages they install

    • Users usually install prebuilt packages, and you certainly don't want to build your package separately for each user. That makes the countermeasure useless: an attacker will likely have access to the same build that the users run

    • Stack layout and register allocation are heavily optimized. Interfering with that process nondeterministically will produce nondeterministic performance, so benchmarks will become useless

    • In theory, users of source-based distributions could take advantage of this security measure, but it will likely introduce nondeterministic bugs that are impossible to locate automatically

    To sum up: this security measure is a good example of so-called security theater. It provides no additional security in most cases, but it greatly complicates many things.
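
For the "simulate it in my source code" part of the question, here is a minimal sketch of the 1-in-1000 bounds-check idea, assuming C++17 and a per-build seed passed on the command line. The names `BUILD_SEED`, `mix`, `site_enabled`, `checked_at` and `AT` are hypothetical and made up for illustration; nothing here is an existing compiler feature, and the caveats listed above still apply to it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical per-build seed. A build script would pass a fresh value,
// e.g. -DBUILD_SEED=123456789ULL; this fallback keeps the file compilable.
#ifndef BUILD_SEED
#define BUILD_SEED 0x9E3779B97F4A7C15ULL
#endif

// splitmix64-style integer mixer, usable in constant expressions.
constexpr std::uint64_t mix(std::uint64_t x) {
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return x ^ (x >> 31);
}

// Decides at compile time whether a given call site gets the extra check.
// Roughly one site in `one_in` is selected, and the selection changes
// whenever the seed changes.
constexpr bool site_enabled(std::uint64_t seed, std::uint64_t site,
                            std::uint64_t one_in) {
    return mix(seed ^ mix(site)) % one_in == 0;
}

// Array access that carries a bounds check only at the selected sites.
// At all other sites the check is discarded by `if constexpr`, so the
// generated code matches a plain arr[i].
template <std::uint64_t Site, typename T, std::size_t N>
T& checked_at(T (&arr)[N], std::size_t i) {
    if constexpr (site_enabled(BUILD_SEED, Site, 1000)) {
        if (i >= N) std::abort();
    }
    return arr[i];
}

// Uses the source line as the site id, so two accesses written on the
// same line share one decision.
#define AT(arr, i) checked_at<__LINE__>(arr, i)
```

With this in place, `AT(buf, idx)` replaces `buf[idx]` for a built-in array `buf`, and building with a different `-DBUILD_SEED=...` moves the checks to different call sites. It only varies which checks exist; register allocation, stack layout and pass ordering remain entirely up to the compiler.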