c++ move-semantics rvalue stdmove

What makes moving objects faster than copying?


I have heard Scott Meyers say "std::move() doesn't move anything" ... but I haven't understood what it means.

To make my question specific, consider the following:

class Box { /* things... */ };

Box box1 = some_value;
Box box2 = box1;    // value of box1 is copied to box2 ... ok

What about:

Box box3 = std::move(box1);

I do understand the rules of lvalues and rvalues, but what I don't understand is what is actually happening in memory. Is it just copying the value in some different way, sharing an address, or what? More specifically: what makes moving faster than copying?

I just feel that understanding this would make everything clear to me. Thanks in advance!

EDIT: Please note that I'm not asking about the std::move() implementation or any syntactic stuff.


Solution

  • As @gudok answered before, most of it is in the implementation, and a little is in the user code.

    The implementation

    Let's assume we're talking about the constructor that initializes a new object from another object of the same class.

    The implementation you'll provide will take into account two cases:

    1. the parameter is an l-value, so you must not modify it: the caller may still use it afterwards
    2. the parameter is an r-value, so, implicitly, the temporary won't live much longer once you are done with it; instead of copying its content, you can steal it

    Both are implemented using an overload:

    Box::Box(const Box & other)
    {
       // copy the contents of other
    }
    
    Box::Box(Box && other)
    {
       // steal the contents of other
    }
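
To see the overload selection in action, here is a minimal sketch. This Box is a hypothetical stand-in that only records which constructor built it; the Origin enum and the check_overloads helper are names invented for this illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical Box that only records which constructor built it.
struct Box {
    enum class Origin { Fresh, Copied, Moved };
    Origin origin = Origin::Fresh;

    Box() = default;
    Box(const Box&) : origin(Origin::Copied) {}     // chosen for l-values
    Box(Box&&) noexcept : origin(Origin::Moved) {}  // chosen for r-values
};

// Illustrative helper: checks which overload each initialization selects.
void check_overloads() {
    Box source;

    Box copy = source;              // source is an l-value
    assert(copy.origin == Box::Origin::Copied);

    Box moved = std::move(source);  // std::move yields an r-value
    assert(moved.origin == Box::Origin::Moved);
}
```

The choice is made at compile time, from the value category of the argument alone. Nothing at run time decides between copying and stealing.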
    

    The implementation for light classes

    Let's say your class contains two integers: You can't steal those because they are plain raw values. The only thing that would seem like stealing would be to copy the values, then set the original to zero, or something like that... Which makes no sense for simple integers. Why do that extra work?

    So for light value classes, offering two specific implementations, one for l-values and one for r-values, makes no sense.

    Offering only the l-value implementation will be more than enough.
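
As a small sketch of the light case, assume a hypothetical Point struct holding two integers (Point and check_light_move are illustrative names only). The compiler-generated move simply copies the integers and leaves the source as it was.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical light class: two plain integers, no user-written constructors.
struct Point {
    int x;
    int y;
};

// For a trivially copyable type, "moving" is the same bitwise copy as copying.
static_assert(std::is_trivially_copyable<Point>::value,
              "Point has no resource that could be stolen");

void check_light_move() {
    Point a{1, 2};
    Point b = std::move(a);  // implicitly generated move: copies both ints

    assert(b.x == 1 && b.y == 2);
    assert(a.x == 1 && a.y == 2);  // the source is left untouched
}
```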

    The implementation for heavier classes

    But in the case of some heavy classes (e.g. std::string, std::map, etc.), copying potentially has a cost, usually in allocations. So, ideally, you want to avoid it as much as possible. This is where stealing the data from temporaries becomes interesting.

    Assume your Box contains a raw pointer to a HeavyResource that is costly to copy. The code becomes:

    Box::Box(const Box & other)
    {
       this->p = new HeavyResource(*(other.p)) ; // costly copying
    }
    
    Box::Box(Box && other)
    {
       this->p = other.p ; // trivial stealing, part 1
       other.p = nullptr ; // trivial stealing, part 2
    }
    

    It's plain that one constructor (the copy-constructor, needing an allocation and a copy of the resource) is much slower than the other (the move-constructor, needing only assignments of raw pointers).
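
The difference can be made visible with a rough, self-contained sketch. The HeavyResource below is an assumed stand-in (a large vector with a copy counter), and the destructor, accessor and check_heavy_move helper are additions for illustration that the snippet above does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical resource: a large buffer that counts how often it is copied.
struct HeavyResource {
    static int copies;
    std::vector<int> data;

    explicit HeavyResource(std::size_t n) : data(n, 42) {}
    HeavyResource(const HeavyResource& other) : data(other.data) { ++copies; }
};
int HeavyResource::copies = 0;

class Box {
public:
    explicit Box(std::size_t n) : p(new HeavyResource(n)) {}
    Box(const Box& other) : p(other.p ? new HeavyResource(*other.p) : nullptr) {}
    Box(Box&& other) noexcept : p(other.p) { other.p = nullptr; }
    Box& operator=(const Box&) = delete;  // assignment left out for brevity
    ~Box() { delete p; }                  // deleting nullptr is a no-op

    const HeavyResource* resource() const { return p; }

private:
    HeavyResource* p;
};

void check_heavy_move() {
    const int before = HeavyResource::copies;
    Box box1(1000000);
    const HeavyResource* original = box1.resource();

    Box box2 = box1;             // allocates and copies one million ints
    assert(HeavyResource::copies == before + 1);
    assert(box2.resource() != original);

    Box box3 = std::move(box1);  // two pointer assignments, no allocation
    assert(HeavyResource::copies == before + 1);
    assert(box3.resource() == original);
    assert(box1.resource() == nullptr);
}
```

After the move, box3 owns the very same HeavyResource that box1 used to own. In memory, nothing heavy was touched: only the pointer changed hands.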

    Note: Be careful when stealing member variables

    Depending on how your object is designed, your implementation of the "steal" might change.

    In the example above, Box::p is assumed to be a raw pointer, owned by the Box class. Thus, to avoid a double-delete bug, the original other.p is set to nullptr. Those two lines together are the equivalent of "stealing".

    Should Box::p be a correctly moveable type (e.g. std::unique_ptr<T>), the two lines could be replaced by a single initialization from std::move(other.p). The member's own move constructor is already implemented that way and will transparently "empty" other.p.
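
A minimal sketch of that variant, assuming Box::p is a std::unique_ptr<HeavyResource>. The HeavyResource struct, the accessors and check_unique_ptr_move are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical resource, standing in for something costly to copy.
struct HeavyResource {
    int value;
};

class Box {
public:
    explicit Box(int v) : p(new HeavyResource{v}) {}
    Box(const Box& other)
        : p(other.p ? new HeavyResource(*other.p) : nullptr) {}  // deep copy
    Box(Box&& other) noexcept
        : p(std::move(other.p)) {}  // unique_ptr's move empties other.p for us

    bool empty() const { return p == nullptr; }
    int value() const { return p->value; }  // precondition: !empty()

private:
    std::unique_ptr<HeavyResource> p;
};

void check_unique_ptr_move() {
    Box box1(7);
    Box box3 = std::move(box1);

    assert(box3.value() == 7);
    assert(box1.empty());  // std::unique_ptr guarantees a null moved-from state
}
```

With a member like this, the move constructor could also be declared as = default, since the compiler-generated version moves each member in turn.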

    The important part is the following: Unless you know what you're doing, you MUST make sure the value is really "stolen" (i.e. the other instance is left with the equivalent of an "empty" p member variable).

    Thanks to @dfherr for pointing this out.

    When is it safe to "steal"?

    The thing is: By default, the compiler will invoke the "fast code" only when the parameter is a temporary (it's a bit more subtle, but bear with me...).

    Why?

    Because the compiler can guarantee that stealing from an object is harmless only if that object is a temporary (or will be destroyed soon after anyway). For any other object, stealing means you suddenly have an object that is valid but in an unspecified state, and that could still be used further down in the code, possibly leading to crashes or bugs:

    Box box3 = static_cast<Box &&>(box1); // calls the "stealing" constructor
    box1.doSomething();         // Oops! You are using an "empty" object!
    

    But sometimes, you want the performance. So, how do you do it?

    The user code

    As you wrote:

    Box box1 = some_value;
    Box box2 = box1;            // value of box1 is copied to box2 ... ok
    Box box3 = std::move(box1); // ???
    

    What happens for box2 is that, as box1 is an l-value, the first, "slow" copy-constructor is invoked. This is the normal, C++98 code.

    Now, for box3, something funny happens: std::move returns the same box1, but as an r-value reference instead of an l-value. So the line:

    Box box3 = ...
    

    ... will NOT invoke the copy-constructor on box1.

    It will invoke INSTEAD the stealing constructor (officially known as the move-constructor) on box1.

    And as your implementation of the move constructor for Box does "steal" the content of box1, at the end of the expression, box1 is in a valid but unspecified state (usually, it will be empty), and box3 contains the (previous) content of box1.

    What about the valid but unspecified state of a moved-out class?

    Of course, writing std::move on an l-value means you promise not to use that l-value again. Or, if you do, you will do it very, very carefully.

    Quoting the C++17 Standard Draft (C++11 was: 17.6.5.15):

    20.5.5.15 Moved-from state of library types [lib.types.movedfrom]

    Objects of types defined in the C++ standard library may be moved from (15.8). Move operations may be explicitly specified or implicitly generated. Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.

    This was about the types in the standard library, but this is something you should follow for your own code.

    What it means is that the moved-out object could now hold any value: empty, zero, or something seemingly random. E.g. for all you know, your string "Hello" would become an empty string "", or become "Hell", or even "Goodbye", if the implementer feels it is the right solution. It still must be a valid string, though, with all its invariants respected.

    So, in the end, unless the implementer (of a type) explicitly committed to a specific behavior after a move, you should act as if you know nothing about a moved-out value (of that type).
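
In practice, "very carefully" usually means calling only operations that have no preconditions, such as assigning a new value or calling clear(). A short sketch with std::string follows; the function name is made up for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

void check_moved_from_reuse() {
    std::string name = "Hello";
    std::vector<std::string> names;
    names.push_back(std::move(name));  // name is now valid but unspecified

    // Do not rely on name's content here: it may be empty, or it may not.
    // Operations without preconditions are fine, e.g. assigning a new value.
    name = "Goodbye";
    assert(name == "Goodbye");
    assert(names.front() == "Hello");

    std::string other = std::move(name);
    name.clear();  // also safe: clear() has no preconditions
    assert(name.empty());
    assert(other == "Goodbye");
}
```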

    Conclusion

    As said above, std::move itself moves nothing. It only tells the compiler: "You see that l-value? Please consider it an r-value, just for a second".

    So, in:

    Box box3 = std::move(box1); // ???
    

    ... the user code (i.e. the std::move) tells the compiler the argument can be considered an r-value for this expression, and thus the move constructor will be called.
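
The claim that std::move is only a cast can be checked directly. In the sketch below (check_move_is_a_cast is an illustrative name), binding the result of std::move to a reference leaves the string intact; the content only changes hands once a move constructor actually receives the r-value.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move only changes the type of the expression: it is a cast.
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move on an l-value yields an r-value reference");

void check_move_is_a_cast() {
    std::string s = "Hello";

    std::string&& ref = std::move(s);  // binds a reference, nothing is moved
    assert(s == "Hello");
    assert(&ref == &s);                // ref still names the same object

    std::string t = std::move(s);      // the move constructor does the work
    assert(t == "Hello");              // s is now valid but unspecified
}
```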

    For the code author (and the code reviewer), the code signals that it is OK to steal the content of box1 and move it into box3. The code author will then have to make sure box1 is not used anymore (or is used very, very carefully). It is their responsibility.

    But in the end, it is the implementation of the move constructor that will make a difference, mostly in performance: If the move constructor actually steals the content of the r-value, then you will see a difference. If it does anything else, then the author lied about it, but this is another problem...