c++ static language-lawyer undefined-behavior

Is using declared and initialized but undefined variables undefined behavior?


In C++ you can have a class C declared in c.h which contains the declaration and initialization of static const int var = 3;.

class C {
public:
...
    static const int var = 3;
...
};

Unless you make that variable inline, or explicitly define it in your c.cpp,

...
const int C::var;
...

it is undefined.

  • So, is it undefined behavior to use var even for things unrelated to its memory address, like a counter in a for loop?
  • How does the compiler use the value of an undefined variable? Does the preprocessor macro-substitute its value?
  • In addition to this case of a static int variable in a class declaration, is there any other circumstance where you can have declared and initialized but undefined variables?
  • Can this happen in the C programming language?

I know there must be some mechanism that makes this work, as I've seen C++ code that uses undefined static const variables, but I would appreciate citations from the standard.


Solution

  • Per [class.static.data]/4 sentence 1 this construct for static data members is allowed as an exception from the general rule only if var has a const-qualified integral or enumeration type and is initialized by a constant expression.

    Per [basic.def.odr]/10 (reaffirmed by [class.static.data]/4 sentence 2) a definition of var is required iff var is odr-used. If var is odr-used but never defined, the program is IFNDR (ill-formed, no diagnostic required).

    As a general rule, a variable is odr-used if it is named in a potentially-evaluated expression outside a discarded statement.

    However, because var is of const integral or enumeration type and initialized with a constant expression, it is usable in constant expressions (and can't have mutable subobjects) and therefore the exception in [basic.def.odr]/4.2 may apply even if var is named in a potentially-evaluated expression. This exception essentially applies if all you do with var in the expression is to immediately read its value (lvalue-to-rvalue conversion), instead of e.g. forming a pointer/reference to it. Because the variable is initialized with a constant expression the compiler can just replace the use of the variable with the compile-time constant value given to var.

    As @LanguageLawyer points out in the comments under this answer, the standard currently says in [intro.object]/1 that definitions can create objects, but not that declarations in general can. This is a problem, because then even if you do not violate ODR per the above exception without a definition for var, you still wouldn't have an object, but the lvalue-to-rvalue conversion is specified in terms of reading the object's value (or depending on interpretation not at all when the ODR exception applies). Absent a defined behavior the program would still have undefined behavior when evaluating the expression. However, this is clearly unintended and a defect in the standard.

    There is another minor exception in [basic.def.odr]/4.3 for when the variable is named essentially immediately as a discarded-value expression, e.g. when writing var; as a statement. In that case neither value nor address of var is needed by the compiler.

    None of this has anything to do with the preprocessor, which runs entirely before the actual C++ code is even parsed.

    This is, I think, the only exception where you can have an initializing declaration of a variable that is not at the same time a definition of the variable. It exists only for historical reasons, from before constexpr and inline could be used on variables to make the values of static data members usable in constant expressions.
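The difference between reading the value of var and odr-using it can be sketched as follows. This is only an illustration under some assumptions: the struct E, which adds the out-of-class definition from the question, and the function names sum_below_var and address_is_usable are hypothetical and do not come from the question.

```cpp
#include <cassert>

// Mirrors the question's class C: the member is declared and initialized
// in the class, but never defined at namespace scope.
struct C {
    static const int var = 3;
};

// Hypothetical second class with the same member, plus the out-of-class
// definition from the question.
struct E {
    static const int var = 3;
};
const int E::var;  // definition; the initializer stays in the class

// Used in constant expressions: no definition of C::var is needed.
static_assert(C::var == 3, "C::var is usable in constant expressions");
int buffer[C::var];

// Only the value of C::var is read (lvalue-to-rvalue conversion), so this
// is not an odr-use and C::var needs no definition.
int sum_below_var() {
    int total = 0;
    for (int i = 0; i < C::var; ++i) {
        total += i;
    }
    return total;  // 0 + 1 + 2
}

// Taking the address or binding a reference is an odr-use, so a definition
// must exist somewhere in the program. E::var has one; C::var does not.
bool address_is_usable() {
    const int* p = &E::var;
    const int& r = E::var;
    assert(*p == 3);
    // const int& bad = C::var;  // odr-use without a definition: IFNDR,
    //                           // in practice often an undefined-symbol link error
    return p == &r;
}
```

Calling both functions from main should compile and link as written. Uncommenting the reference to C::var makes the program ill-formed, although the standard does not require the toolchain to diagnose it. Since C++17, declaring the member inline or constexpr inside the class would make the in-class declaration a definition and remove the need for the separate line.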