Tags: c++, compiler-optimization, undefined-behavior, compiler-options, observable-behavior

Is this compiler optimization inconsistency entirely explained by undefined behaviour?


During a discussion I had with a couple of colleagues the other day, I threw together a piece of C++ code to illustrate a memory access violation.

I am currently in the process of slowly returning to C++ after a long spell of almost exclusively using garbage-collected languages, and I guess my loss of touch shows: I've been quite puzzled by the behaviour my short program exhibited.

The code in question is as follows:

#include <iostream>

using std::cout;
using std::endl;

struct A
{
    int value;
};

void f()
{
    A* pa;    // Uninitialized pointer
    cout << pa << endl;
    pa->value = 42;    // Writing via an uninitialized pointer
}

int main(int argc, char** argv)
{   
    f();

    cout<< "Returned to main()" << endl;
    return 0;
}

I compiled it with GCC 4.9.2 on Ubuntu 15.04 with the -O2 compiler flag set. My expectation when running it was that it would crash when the line marked with the comment "Writing via an uninitialized pointer" was executed.
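
For reference, the build-and-run steps were along these lines (the source file name here is just a placeholder of mine):

g++ -O2 -o demo demo.cpp
./demo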

Contrary to my expectations, however, the program ran successfully to the end, producing the following output:

0
Returned to main()

I recompiled the code with the -O0 flag (to disable all optimizations) and ran the program again. This time, the behaviour was as I expected:

0
Segmentation fault

(Well, almost: I didn't expect the uninitialized pointer to happen to hold 0.) Based on this observation, I presume that when compiling with -O2, the fatal instruction got optimized away. This makes sense, since no further code accesses pa->value after it is set by the offending line, so, presumably, the compiler determined that removing it would not modify the observable behaviour of the program.
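
In other words, I assume that under -O2 the function is effectively reduced to something like the following (my own reconstruction of what the optimizer might do, not actual compiler output):

void f()
{
    A* pa;                 // still uninitialized
    cout << pa << endl;    // the only remaining observable effect
    // pa->value = 42;     // dead store: nothing reads it, so it can be dropped
}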

I reproduced this several times, and every time the program would crash when compiled without optimization and miraculously work when compiled with -O2.

My hypothesis was further confirmed when I added a line that outputs pa->value to the end of f()'s body:

cout << pa->value << endl;

Just as expected, with this line in place the program consistently crashes, regardless of the optimization level with which it was compiled.
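
For completeness, the modified f() then looks like this:

void f()
{
    A* pa;                     // uninitialized pointer
    cout << pa << endl;
    pa->value = 42;            // writing via an uninitialized pointer
    cout << pa->value << endl; // the added read-back of pa->value
}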

This all makes sense if my assumptions so far are correct. However, where my understanding breaks down somewhat is the case where I move the code from the body of f() directly into main(), like so:

int main(int argc, char** argv)
{   
    A* pa;
    cout << pa << endl;
    pa->value = 42;
    cout << pa->value << endl;

    return 0;
}

With optimizations disabled, this program crashes, just as expected. With -O2, however, the program successfully runs to the end and produces the following output:

0
42

And this makes no sense to me.
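
The only transformation I can think of that would produce exactly this output is the compiler folding the store and the subsequent read together, so that the memory pa points to is never actually touched, roughly as in the sketch below (purely my guess, reusing the includes and struct A from above; I have not inspected the generated code):

int main(int argc, char** argv)
{
    A* pa;                 // never made to point at a valid A
    cout << pa << endl;    // prints whatever indeterminate value pa holds
    cout << 42 << endl;    // store and load collapsed into the constant
    return 0;
}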

This answer mentions "dereferencing a pointer that has not yet been definitely initialized" (which is exactly what I'm doing) as one of the sources of undefined behaviour in C++.

So, is this difference in the way optimization affects the code in main(), compared to the code in f(), entirely explained by the fact that my program contains UB, and the compiler is thus technically free to "go nuts"? Or is there some fundamental difference, which I don't know of, between the way code in main() is optimized and the way code in other routines is optimized?


Solution

  • Writing through unknown pointers has always been something that could have unknown consequences. What's nastier is a currently fashionable philosophy which suggests that compilers should assume programs will never receive inputs that cause UB, and should thus optimize out any code that would test for such inputs if those tests would not themselves prevent the UB from occurring.

    Thus, for example, given:

    #include <stdint.h>

    void launch_missiles(void);   // defined elsewhere

    uint32_t hey(uint16_t x, uint16_t y)
    {
      if (x < 60000)
        launch_missiles();
      else
        return x*y;    // x and y are promoted to int, so the product can overflow
      return 0;
    }

    uint32_t wow(uint16_t x)
    {
      return hey(x, 40000);
    }
    

    a 32-bit compiler could legitimately replace wow with an unconditional call to launch_missiles, without regard for the value of x, since x "can't possibly" be greater than 53687 (any larger value would cause the calculation of x*y to overflow, and since x*y is only evaluated when x >= 60000, that branch could never be reached without invoking UB). The authors of C89 noted that the majority of compilers of that era would calculate the correct result in a situation like the above; but since the Standard imposes no requirements on compilers once UB occurs, hyper-modern philosophy regards it as "more efficient" for compilers to assume that programs will never receive inputs that would necessitate reliance upon such things.
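
    For reference, here is where the 53687 figure comes from, assuming the usual 32-bit int so that the uint16_t operands of x*y are promoted to signed int (the little program below just prints the relevant products using 64-bit arithmetic):

    #include <climits>
    #include <iostream>

    int main()
    {
        // With a 32-bit int, x*y is only defined while the product fits in INT_MAX.
        std::cout << "INT_MAX       = " << INT_MAX << '\n';          // 2147483647
        std::cout << "53687 * 40000 = " << 53687LL * 40000 << '\n';  // 2147480000, still fits
        std::cout << "53688 * 40000 = " << 53688LL * 40000 << '\n';  // 2147520000, overflows int: UB
        std::cout << "60000 * 40000 = " << 60000LL * 40000 << '\n';  // 2400000000, so the else branch always overflows
    }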