Tags: c++, language-lawyer, standards, undefined-behavior, trivially-copyable

Why (if that is the case) does the standard say that copying uninitialized memory with memcpy is UB?


When a class member cannot have a sensible value at the moment of construction, I don't initialize it. Obviously that only applies to POD types; you cannot leave an object with constructors uninitialized.

The advantage, apart from saving the CPU cycles spent initializing something to a meaningless value, is that I can detect erroneous use of these variables with valgrind, which would not be possible if I just gave them some arbitrary value.

For example,

struct MathProblem {
  bool finished;
  double answer;

  MathProblem() : finished(false) { }
};

Until the math problem is solved (finished) there is no answer. It makes no sense to initialize answer in advance (to, say, zero) because that might not be the answer. answer only has a meaning after finished has been set to true.

Usage of answer before it is initialized is therefore an error and perfectly OK to be UB.

However, a trivial copy of answer before it is initialized is currently also UB (if I understand the standard correctly), and that doesn't make sense. The default copy and move constructors should simply be able to make a trivial copy (as if by memcpy), whether the member is initialized or not. For example, I might want to move this object into a container:

v.push_back(MathProblem());

and then work with the copy inside the container.

Is moving an object with an uninitialized, trivially copyable member indeed defined as UB by the standard? And if so, why? It doesn't seem to make sense.


Solution

  • Is moving an object with an uninitialized, trivially copyable member indeed defined as UB by the standard?

    It depends on the type of the member. The standard says:

    [basic.indet]

    When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced ([expr.ass]).

    If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:

    • If an indeterminate value of unsigned ordinary character type ([basic.fundamental]) or std::byte type ([cstddef.syn]) is produced by the evaluation of:

      • the second or third operand of a conditional expression,
      • the right operand of a comma expression,
      • the operand of a cast or conversion ([conv.integral], [expr.type.conv], [expr.static.cast], [expr.cast]) to an unsigned ordinary character type or std::byte type ([cstddef.syn]), or
      • a discarded-value expression,

      then the result of the operation is an indeterminate value.

    • If an indeterminate value of unsigned ordinary character type or std::byte type is produced by the evaluation of the right operand of a simple assignment operator ([expr.ass]) whose first operand is an lvalue of unsigned ordinary character type or std::byte type, an indeterminate value replaces the value of the object referred to by the left operand.

    • If an indeterminate value of unsigned ordinary character type is produced by the evaluation of the initialization expression when initializing an object of unsigned ordinary character type, that object is initialized to an indeterminate value.

    • If an indeterminate value of unsigned ordinary character type or std::byte type is produced by the evaluation of the initialization expression when initializing an object of std::byte type, that object is initialized to an indeterminate value.

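    None of these exceptions applies to your example: the uninitialized member is a double, and the implicit copy or move constructor copies it memberwise, which produces an indeterminate value of type double. So the copy is UB.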


    copying uninitialized memory with memcpy is UB?

    It is not. std::memcpy treats the object as an array of bytes, and copying bytes falls under the exceptions quoted above, so the copy itself is not UB. You still have UB if you then read an indeterminate value from the copy (unless one of the exceptions above applies).
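
As a minimal sketch of that distinction, following this answer's reading of the standard: the struct below is the one from the question, while the function name copy_then_solve and the value 42.0 are purely illustrative. The bytewise copy never reads answer as a double; the member is read only after a real value has replaced the indeterminate one.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct MathProblem {
  bool finished;
  double answer;  // deliberately left uninitialized until the problem is solved

  MathProblem() : finished(false) { }
};

// memcpy is only appropriate for trivially copyable types. A user-provided
// default constructor does not prevent the type from being trivially copyable.
static_assert(std::is_trivially_copyable<MathProblem>::value,
              "MathProblem must be trivially copyable to be copied with memcpy");

// Hypothetical helper: copies an unsolved problem bytewise, then solves the copy.
double copy_then_solve() {
  MathProblem src;  // src.answer holds an indeterminate value
  MathProblem dst;

  // Bytewise copy: no value of type double is produced here.
  std::memcpy(&dst, &src, sizeof dst);

  // finished was initialized in src, so reading it from the copy is fine.
  assert(!dst.finished);

  // dst.answer is still indeterminate at this point; reading it would be UB.
  dst.answer = 42.0;  // replace the indeterminate value (illustrative number)
  dst.finished = true;

  return dst.answer;  // well-defined: answer now holds a real value
}
```

Calling copy_then_solve() from main returns 42.0. Replacing the std::memcpy line with dst = src; would be the memberwise copy that the quoted rule makes UB.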


    why?

    The C++ standard doesn't include a rationale for most rules. This particular rule has existed since the first standard. It is slightly stricter than the related C rule, which is about trap representations. To my understanding, there is no established convention for trap handling, so the authors didn't wish to restrict implementations by specifying it and instead opted to make it UB. This also has the effect of allowing the optimiser to assume that indeterminate values will never be read.


    I might want to move this object into a container:

    Moving an uninitialised object into a container is typically a logic error. It is unclear why you would want to do such a thing.