Search code examples
c++cgccx86undefined-behavior

Why does integer overflow on x86 with GCC cause an infinite loop?


The following code goes into an infinite loop on GCC:

#include <iostream>
using namespace std;

int main(){
    int i = 0x10000000;

    int c = 0;
    do{
        c++;
        i += i;
        cout << i << endl;
    }while (i > 0);

    cout << c << endl;
    return 0;
}

So here's the deal: Signed integer overflow is technically undefined behavior. But GCC on x86 implements integer arithmetic using x86 integer instructions - which wrap on overflow.

Therefore, I would have expected it to wrap on overflow - despite the fact that it is undefined behavior. But that's clearly not the case. So what did I miss?

I compiled this using:

~/Desktop$ g++ main.cpp -O2

GCC Output:

~/Desktop$ ./a.out
536870912
1073741824
-2147483648
0
0
0

... (infinite loop)

With optimizations disabled, there is no infinite loop and the output is correct. Visual Studio also correctly compiles this and gives the following result:

Correct Output:

~/Desktop$ g++ main.cpp
~/Desktop$ ./a.out
536870912
1073741824
-2147483648
3

Here are some other variations:

i *= 2;   //  Also fails and goes into infinite loop.
i <<= 1;  //  This seems okay. It does not enter infinite loop.

Here's all the relevant version information:

~/Desktop$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ..

...

Thread model: posix
gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) 
~/Desktop$ 

So the question is: Is this a bug in GCC? Or did I misunderstand something about how GCC handles integer arithmetic?

*I'm tagging this C as well, because I assume this bug will reproduce in C. (I haven't verified it yet.)

EDIT:

Here's the assembly of the loop: (if I recognized it properly)

.L5:
addl    %ebp, %ebp
movl    $_ZSt4cout, %edi
movl    %ebp, %esi
.cfi_offset 3, -40
call    _ZNSolsEi
movq    %rax, %rbx
movq    (%rax), %rax
movq    -24(%rax), %rax
movq    240(%rbx,%rax), %r13
testq   %r13, %r13
je  .L10
cmpb    $0, 56(%r13)
je  .L3
movzbl  67(%r13), %eax
.L4:
movsbl  %al, %esi
movq    %rbx, %rdi
addl    $1, %r12d
call    _ZNSo3putEc
movq    %rax, %rdi
call    _ZNSo5flushEv
cmpl    $3, %r12d
jne .L5

Solution

  • When the standard says it's undefined behavior, it means it. Anything can happen. "Anything" includes "usually integers wrap around, but on occasion weird stuff happens".

    Yes, on x86 CPUs, integers usually wrap the way you expect. This is one of those exceptions. The compiler assumes you won't cause undefined behavior, and optimizes away the loop test. If you really want wraparound, pass -fwrapv to g++ or gcc when compiling; this gives you well-defined (twos-complement) overflow semantics, but can hurt performance.