What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?
Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!
Let's look at a classic example:
#include <iostream>
int main()
{
char* p = "hello!\n"; // yes I know, deprecated conversion
p[0] = 'y';
p[5] = 'w';
std::cout << p;
}
The variable p
points to the string literal "hello!\n"
, and the two assignments below try to modify that string literal. What does this program do? According to the C++ standard, [lex.string] note 4, it invokes undefined behavior:
The effect of attempting to modify a string literal is undefined.
I can hear people screaming "But wait, I can compile this no problem and get the output yellow
" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.
Other examples of undefined behavior include
i++ + ++i
.[intro.defs] also defines undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:
implementation-defined behavior [defns.impl.defined]
behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation documents
unspecified behavior [defns.unspecified]
behavior, for a well-formed program construct and correct data, that depends on the implementation
[Note: The implementation is not required to document which behavior occurs. The range of possible behaviors is usually delineated by this document. — end note]
undefined behavior [defns.undefined]
behavior for which this document imposes no requirements
[Note: Undefined behavior may be expected when this document omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). [...] — end note]
What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.