Tags: c++, gcc, undefined-behavior, gcc10

Consistency of undefined behavior for a fixed compiler


I wonder how compilers deal with undefined behavior. I will take GCC 10.4 for the x86 architecture with the -O2 -std=c++03 flags as an example, but please feel free to comment on other compilers. What does it take to alter the outcome of an operation with UB? The language standard does not prescribe what should happen when an operation has UB, but the compiler will still do something. That is, I am not asking what happens with UB from C++'s perspective but from the compiler's perspective. I know the C++ standard imposes no restriction on the behavior of such a program.

For example, if I have UB due to the value of the object in a memory location being modified more than once by the evaluation of an expression, like so:

int i = 0;
i = ++i + i++; // UB pre-C++11

the chosen compiler in this setup generates assembly code that reduces the computation to a constant, 3 in this case; see https://godbolt.org/z/MEEGT15dM.

What can cause the constant to become anything other than 3 if I do not change the compiler, its version, the flags, or the architecture? Could editing the function, without changing the value of i before the erroneous statement, cause it?


Solution

  • The C and C++ language standards define "undefined behavior" as behavior for which the standard imposes no requirements. Note the qualifier "for which the standard imposes no requirements": it does not mean there are no requirements on the behavior at all, only that there are none from the language standard's perspective. There may be requirements from other specifications the compiler seeks to conform to, including its own documentation.

    Compilers commonly support many things that are "undefined behavior" in the sense of a language standard. A few examples follow, with GCC specifics since you asked about GCC:

    Some compilers define the behavior of certain things the C or C++ standards leave undefined, or have options to define the behavior. For example, MSVC and gcc -fno-strict-aliasing define the behavior of *(uint32_t *)&my_float, but GCC does not by default.

    GCC also has command-line options to define the behavior of signed-integer overflow as two's-complement wrap-around (-fwrapv) or as trapping (-ftrapv), and options that define the behavior of some otherwise-undefined operations, such as printing a warning message, or printing an error message and aborting (-fsanitize=undefined). (Note that being undefined behavior does not rule out "happens to work" behavior, which makes unit testing insufficient to verify correctness in other contexts.)


    Anything a compiler supports should be stable; it should not be affected by changing optimization switches, language-variant-selection switches, or other switches except as documented by the compiler. So these “undefined behaviors” should be consistent.

    Outside of these, there are things that are neither defined by the applicable language standard nor by the compiler (directly in its own documentation or indirectly through specifications it seeks to conform to). For the most part, you should regard these as not stable. Behaviors that are not at all part of the compiler design may change when optimization switches are changed, when other code is changed, when patterns of memory use or contents of memory are changed, and so on.

    Although you generally cannot rely on such behaviors, this does not mean they are without pattern. Compilers are not designed randomly; they have properties that arise out of their design. Experienced programmers may recognize certain symptoms as clues about what is wrong in a program. Even though the behavior is undefined (by the language standard and by the compiler), it nonetheless may fall into a pattern because of how we design software. For example, overrunning a buffer may corrupt data further up (earlier) on the stack. This is not guaranteed to happen; optimization can change what happens when a buffer is overrun, but it is nonetheless a common result.

    Furthermore, it is a result some people do rely on. Malicious people may seek to exploit buffer overruns to attack programs and steal information or money, to take control of systems, or to crash or otherwise cause denial of service. The behavior they exploit is not random; it is at least partly predictable, and that is what affords them the opportunity to exploit it. So even fully undefined behavior cannot be regarded as random; good programmers must consider the consequences of undefined behavior and seek to mitigate it.

    What can cause the constant to become anything other than 3 if I do not change the compiler, its version, flags, or architecture?

    For the most part, if you change nothing about a compilation, you should get the same result every time, with a few exceptions. This is because a compiler is a machine; it proceeds mechanically and executes its program mechanically. If the compiler has no bugs, then its behavior should be defined by its source code (even if we, the users, do not know what the definition is), and that means that, given the same input and circumstances, it should produce the same output.

    One exception is that compilers might inject date or time information into their output. Similarly, other variations in the execution environment might cause some changes. Another issue is that the output of the compiler is object code, and the object code is not the complete program, so the final program may be influenced by other things. An example is that modern multi-user operating systems commonly use address space layout randomization, so many of the addresses in a program will vary from execution to execution. This is unlikely to affect your i = ++i + i++; example, but it means other bugs resulting in undefined behavior can exhibit some randomness due to the addresses involved.

    Once C++ is compiled to machine code, that machine code has specific behavior for specific inputs. (Those inputs may depend on things like the size of environment variables, details of what libraries do, or values in stack memory.) Unlike the abstract machine of the C++ standard, the execution model for machine code on most machines includes little if any unpredictable behavior, and compilers normally avoid generating machine code with unpredictable behavior, even for code paths that have compile-time-visible undefined behavior.