Search code examples
c++undefined-behaviorsanitizer

Is there a C++ implementation that detects all undefined behavior?


A huge number of operations in C++ result in undefined behavior, where the spec is completely mute about what the program's behavior ought to be and allows for anything to happen. Because of this, there are all sorts of cases where people have code that compiles in debug but not release mode, or that works until a seemingly unrelated change is made, or that works on one machine but not another, etc.

My question is whether there is a utility that looks at the execution of C++ code and flags all instances where the program invokes undefined behavior. While it's nice that we have tools like valgrind and checked STL implementations, these aren't as strong as what I'm thinking about - valgrind can have false negatives if you trash memory that you still have allocated, for example, and checked STL implementations won't catch deleting through a base class pointer.

Does this tool exist? Or would it even be useful to have it lying around at all?

EDIT: I am aware that in general it is undecidable to statically check whether a C++ program may ever execute something that has undefined behavior. However, it is possible to determine whether a specific execution of a C++ produced undefined behavior. One way to do this would be to make a C++ interpreter that steps through the code according to the definitions set out in the spec, at each point determining whether or not the code has undefined behavior. This won't detect undefined behavior that doesn't occur on a particular program execution, but it will find any undefined behavior that actually manifests itself in the program. This is related to how it is Turing-recognizable to determine if a TM accepts some input, even if it's still undecidable in general.

Thanks!


Solution

  • John Regehr in Finding Undefined Behavior Bugs by Finding Dead Code points out a tool called STACK and I quote from the site (emphasis mine):

    Optimization-unstable code (unstable code for short) is an emerging class of software bugs: code that is unexpectedly eliminated by compiler optimizations due to undefined behavior in the program. Unstable code is present in many systems, including the Linux kernel and the Postgres database server. The consequences of unstable code range from incorrect functionality to missing security checks.

    STACK is a static checker that detects unstable code in C/C++ programs. Applying STACK to widely used systems has uncovered 160 new bugs that have been confirmed and fixed by developers.

    Also in C++11 for the case of constexpr variables and functions undefined behavior should be caught at compile time.

    We also have gcc ubsan:

    GCC recently (version 4.9) gained Undefined Behavior Sanitizer (ubsan), a run-time checker for the C and C++ languages. In order to check your program with ubsan, compile and link the program with -fsanitize=undefined option. Such instrumented binaries have to be executed; if ubsan detects any problem, it outputs a “runtime error:” message, and in most cases continues executing the program.

    and Clang Static Analyzer which includes many checks for undefined behavior. For example clangs -fsanitize checks which includes -fsanitize=undefined:

    -fsanitize=undefined: Fast and compatible undefined behavior checker. Enables the undefined behavior checks that have small runtime cost and no impact on address space layout or ABI. This includes all of the checks listed below other than unsigned-integer-overflow.

    and for C we can look at his article It’s Time to Get Serious About Exploiting Undefined Behavior which says:

    [..]I confess to not personally having the gumption necessary for cramming GCC or LLVM through the best available dynamic undefined behavior checkers: KCC and Frama-C.[...]

    Here is a link to kcc and I quote:

    [...]If you try to run a program that is undefined (or one for which we are missing semantics), the program will get stuck. The message should tell you where it got stuck and may give a hint as to why. If you want help deciphering the output, or help understanding why the program is undefined, please send your .kdump file to us.[...]

    and here are a link to Frama-C, an article where the first use of Frama-C as a C interpreter is described and an addendum to the article.