
Integer overflow trapping with LLVM?


I'm creating a statically compiled programming language, and I'm using LLVM as its backend. I want my language to trap/crash whenever integer overflow occurs.

I'm aware of things like llvm.sadd.with.overflow, but I don't think that's an optimal/efficient solution. That intrinsic returns a struct of two values instead of giving me direct access to the overflow flag (OF). Ideally, after each arithmetic operation I would just have a "JO" assembly instruction to trap whenever integer overflow occurs. This is exactly what clang's UndefinedBehaviorSanitizer does. However, I'm compiling to LLVM IR, not C or C++.

How can I use the UndefinedBehaviorSanitizer (or accomplish something equivalent) to handle integer overflow, directly within LLVM IR?


Solution

  • I'm aware of things like llvm.sadd.with.overflow, but I don't think that's an optimal/efficient solution. [...] Ideally, after each arithmetic operation I would just have a "JO" assembly instruction to trap whenever integer overflow occurs. This is exactly what clang's UndefinedBehaviorSanitizer does.

    The UndefinedBehaviorSanitizer itself works by generating calls to llvm.sadd.with.overflow. You can verify this by compiling the following C program with -fsanitize=undefined and inspecting the generated LLVM IR:

    bla.c:

    #include <stdio.h>
    
    int main(void){
      int x;
      scanf("%d", &x);
      printf("%d\n", x+1);
      return 0;
    }
    

    Command line:

    clang -fsanitize=undefined -emit-llvm -O2 -S bla.c
    

    bla.ll (excerpt):

      %5 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %4, i32 1), !nosanitize !8
      %6 = extractvalue { i32, i1 } %5, 0, !nosanitize !8
      %7 = extractvalue { i32, i1 } %5, 1, !nosanitize !8
      br i1 %7, label %8, label %10, !prof !9, !nosanitize !8
    
    ; <label>:8:                                      ; preds = %0
      %9 = zext i32 %4 to i64, !nosanitize !8
      call void @__ubsan_handle_add_overflow(i8* bitcast ({ { [6 x i8]*, i32, i32 }, { i16, i16, [6 x i8] }* }* @1 to i8*), i64 %9, i64 1) #5, !nosanitize !8
    

    In the generated x86-64 assembly, the sadd.with.overflow call ends up as a regular incl instruction¹ and the br i1 %7 as a jo, so the intrinsic already gives you exactly the code you want.


    ¹ It would be a regular add instruction if the C code added something other than 1.
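
    If you want a plain trap rather than a call into the UBSan runtime, the same pattern can be sketched in C with the GCC/Clang builtin __builtin_add_overflow, which clang is expected to lower to llvm.sadd.with.overflow for int operands, followed by __builtin_trap. The helper names add_overflows and checked_add below are hypothetical, chosen only for illustration; this is a minimal sketch, not what UBSan itself emits.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: reports signed overflow instead of trapping.
   __builtin_add_overflow is a GCC/Clang extension, not standard C.
   It stores the wrapped result in *sum and returns true on overflow. */
bool add_overflows(int a, int b, int *sum) {
    return __builtin_add_overflow(a, b, sum);
}

/* Hypothetical helper: adds two ints and aborts the program on overflow.
   __builtin_trap() is also a GCC/Clang extension; it never returns. */
int checked_add(int a, int b) {
    int sum;
    if (__builtin_add_overflow(a, b, &sum)) {
        __builtin_trap();
    }
    return sum;
}
```

    Compiling this with clang -O2 -emit-llvm -S lets you compare the IR for checked_add against the UBSan excerpt above; a frontend that emits LLVM IR directly could produce the same shape, with the overflow branch leading to a trap instead of __ubsan_handle_add_overflow.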