Search code examples
c++c++11language-lawyermemcpyobject-lifetime

Why would the behavior of std::memcpy be undefined for objects that are not TriviallyCopyable?


From http://en.cppreference.com/w/cpp/string/byte/memcpy:

If the objects are not TriviallyCopyable (e.g. scalars, arrays, C-compatible structs), the behavior is undefined.

At my work, we have used std::memcpy for a long time to bitwise swap objects that are not TriviallyCopyable using:

void swapMemory(Entity* ePtr1, Entity* ePtr2)
{
   static const int size = sizeof(Entity); 
   char swapBuffer[size];

   memcpy(swapBuffer, ePtr1, size);
   memcpy(ePtr1, ePtr2, size);
   memcpy(ePtr2, swapBuffer, size);
}

and never had any issues.

I understand that it is trivial to abuse std::memcpy with non-TriviallyCopyable objects and cause undefined behavior downstream. However, my question:

Why would the behavior of std::memcpy itself be undefined when used with non-TriviallyCopyable objects? Why does the standard deem it necessary to specify that?

UPDATE

The contents of http://en.cppreference.com/w/cpp/string/byte/memcpy have been modified in response to this post and the answers to the post. The current description says:

If the objects are not TriviallyCopyable (e.g. scalars, arrays, C-compatible structs), the behavior is undefined unless the program does not depend on the effects of the destructor of the target object (which is not run by memcpy) and the lifetime of the target object (which is ended, but not started by memcpy) is started by some other means, such as placement-new.

PS

Comment by @Cubbi:

@RSahu if something guarantees UB downstream, it renders the entire program undefined. But I agree that it appears to be possible to skirt around UB in this case and modified cppreference accordingly.


Solution

  • Why would the behavior of std::memcpy itself be undefined when used with non-TriviallyCopyable objects?

    It's not! However, once you copy the underlying bytes of one object of a non-trivially copyable type into another object of that type, the target object is not alive. We destroyed it by reusing its storage, and haven't revitalized it by a constructor call.

    Using the target object - calling its member functions, accessing its data members - is clearly undefined[basic.life]/6, and so is a subsequent, implicit destructor call[basic.life]/4 for target objects having automatic storage duration. Note how undefined behavior is retrospective. [intro.execution]/5:

    However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

    If an implementation spots how an object is dead and necessarily subject to further operations that are undefined, ... it may react by altering your programs semantics. From the memcpy call onward. And this consideration gets very practical once we think of optimizers and certain assumptions that they make.

    It should be noted that standard libraries are able and allowed to optimize certain standard library algorithms for trivially copyable types, though. std::copy on pointers to trivially copyable types usually calls memcpy on the underlying bytes. So does swap.
    So simply stick to using normal generic algorithms and let the compiler do any appropriate low-level optimizations - this is partly what the idea of a trivially copyable type was invented for in the first place: Determining the legality of certain optimizations. Also, this avoids hurting your brain by having to worry about contradictory and underspecified parts of the language.