Tags: c, compiler-optimization, undefined-behavior, integer-overflow

What optimizations are possible when loop variable is undefined on overflow?


This article features an example snippet of C code which the compiler can optimize thanks to the fact that overflow is undefined for the type of the loop counter.

Here's the snippet, along with the article's accompanying comment:

for (i = 0; i <= N; ++i) { ... }

In this loop, the compiler can assume that the loop will iterate exactly N+1 times if "i" is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX) - which then disables these important loop optimizations. This particularly affects 64-bit platforms since so much code uses "int" as induction variables.
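For reference, here is how I read the "possibly infinite" case. This sketch is mine, not the article's, and assumes signed overflow is defined to wrap around (e.g. under gcc's -fwrapv):

    #include <limits.h>

    void f(void) {
        int N = INT_MAX;
        // Under wrap-around semantics, once i reaches INT_MAX, ++i
        // yields INT_MIN, which still satisfies i <= N, so the loop
        // never terminates. Without -fwrapv, ++i at INT_MAX is
        // undefined behavior instead.
        for (int i = 0; i <= N; ++i) {
            /* ... */
        }
    }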

I have quite a few doubts:

  • I understand that, if the type of i is undefined on overflow, the compiler can assume that i will not wrap around, but why would this imply that the loop runs N+1 times? Can't it be that i is altered in the body of the loop?
  • Even given that assumption, what optimization could it do based on that? If N is not known at compile-time, for instance, the loop can't be unrolled, can it?
  • "Broad range of loop optimizations" makes me think I'm missing a lot, here.

Solution

  • Can't it be that i is altered in the body of the loop?

    No, they mean this:

    size_t N = getSize();
    for (int i = 0; i <= N; ++i) { 
       // i is never changed here
    }
    

    Assuming a 64-bit platform, N (a size_t) might be too big for the loop to run correctly: if N >= INT_MAX, then i would have to overflow before the condition i <= N could ever become false. (Perhaps the programmer knows that N is never too big, but the compiler doesn't.) But signed overflow "cannot" happen, so the compiler can assume that N is never too big after all -- even if it sometimes is.
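
    Concretely, one thing this assumption buys is induction-variable widening. Here is my sketch of what the compiler may then treat the loop as (not literal compiler output; getSize is the same placeholder as above):

    size_t N = getSize();
    // Because the original int counter "cannot" overflow, it may be
    // kept in a full 64-bit register: the trip count is exactly N + 1,
    // and address computations like a[i] need no per-iteration
    // sign extension or overflow checks.
    for (size_t i = 0; i <= N; ++i) {
       // same body, with i used at register width
    }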

    By contrast,

    for (unsigned i = 0; i <= N; ++i) {
       // same body as before
    }

    here, even if N is the maximum representable unsigned value, i won't overflow; it will wrap around, which is completely legal. The compiler can no longer assume that N is never too big, and must deal with a loop that potentially never ends.
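
    The wrap-around itself is easy to see in isolation; this self-contained program (mine, for illustration) is well defined and its assertion holds:

    #include <assert.h>
    #include <limits.h>

    int main(void) {
        unsigned u = UINT_MAX;
        ++u;              // well defined: unsigned arithmetic wraps modulo UINT_MAX + 1
        assert(u == 0u);  // hence i <= N stays true forever when N == UINT_MAX
        return 0;
    }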

  • what optimization could it do based on that?

    Just compare the code generated for the two variants and watch in awe.

    If the compiler knows the loop is going to end, it can use a (presumably) more efficient set of instructions. The machine code becomes much longer and much more complicated, but who cares if it wins us 3.27 picoseconds per run?
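
    To reproduce the comparison yourself, compile a pair like the following at -O2 and inspect the assembly (my own example; the function names are made up, and the exact output depends on compiler and target):

    #include <stddef.h>

    // Identical loops except for the type of the counter.
    long sum_signed(const long *a, size_t N) {
        long s = 0;
        for (int i = 0; i <= N; ++i)       // overflow of i is UB, so the
            s += a[i];                     // trip count is provably N + 1
        return s;
    }

    long sum_unsigned(const long *a, size_t N) {
        long s = 0;
        for (unsigned i = 0; i <= N; ++i)  // i wraps if N >= UINT_MAX, so
            s += a[i];                     // the loop may never terminate
        return s;
    }

    The signed version typically becomes a simple counted loop that further optimizations (unrolling, vectorization) can build on; the unsigned version has to preserve the possible wrap-around.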