c++ rvalue-reference

"Transfer" to function in C++: pass by value or rvalue reference?


A related question discusses passing by value vs. passing by rvalue reference in C++. However, I find its answers unsatisfactory and not entirely correct.

Let's say I want to define a Queue abstract interface:

  • Queue is templated on the object it accepts
  • It should be possible to store noncopyable types in the Queue
  • The Queue may have multiple implementations.
  • Queue implementations should not be precluded from the opportunity of avoiding copies, if they can avoid them.

I wish for my interface to express the intent that calling Queue<T>::push(value) implies the "transfer" of value to the queue. By "transfer" I mean that:

  1. Queue::push is allowed to do whatever it wants with value, including modifying it, and the user should not be affected.
  2. After a call to Queue::push, if the user uses value, it should not have side effects for the Queue.

My options for the Queue interface are:

Option 1:

template<typename T>
class Queue {
public:
   virtual void push(const T& value) = 0;
};

Option 2:

template<typename T>
class Queue {
public:
   virtual void push(T&& value) = 0;
};

Option 3:

template<typename T>
class Queue {
public:
   virtual void push(T value) = 0;
};

The C++ core guidelines don't help much:

  • F.16 says "For “in” parameters, pass cheaply-copied types by value and others by reference to const". Here "value" is an in parameter, so the guidelines say I should pass it by reference to const. But this requires "T" to be copyable, which is an unnecessary restriction. Why shouldn't I be able to pass a std::unique_ptr to push?
  • F.18 says that "For “will-move-from” parameters, pass by X&& and std::move the parameter". First of all, what is a “will-move-from” parameter? Secondly, as I'm defining the Queue interface, I have no idea what the implementation will do. It may move the object, or it may not. I don't know.

So what should I do?

  • Option 1: this does not express the right semantics IMO. With this option, the code std::unique_ptr<int> ptr; queue.push(std::move(ptr)); cannot work, because an implementation that receives a const reference can only store the value by copying it, but there's no reason why it shouldn't work.
  • Option 2: this should have the right semantics, but it forces the user to explicitly either move or copy the value. But why should Queue force the user to do so? Why should Queue forbid, for example, the following code? std::shared_ptr<int> ptr; queue.push(ptr);
  • Option 3: it allows pushing a copy of a value, or "moving in" a value. So std::shared_ptr<int> ptr; queue.push(ptr); is valid, and so is std::unique_ptr<int> ptr; queue.push(std::move(ptr)); The required semantics hold. However, nothing stops the user from calling Queue<std::vector<int>>::push(veryLargeVector), which may cause a large, unnecessary copy.
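
Which calls each option accepts can be checked at compile time. The following is a minimal sketch, assuming C++17. The names QueueConstRef, QueueRvalueRef, QueueByValue and accepts are hypothetical and only mirror Options 1 to 3:

```cpp
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical interfaces mirroring Options 1, 2 and 3.
template <typename T>
struct QueueConstRef {
    virtual ~QueueConstRef() = default;
    virtual void push(const T& value) = 0;
};

template <typename T>
struct QueueRvalueRef {
    virtual ~QueueRvalueRef() = default;
    virtual void push(T&& value) = 0;
};

template <typename T>
struct QueueByValue {
    virtual ~QueueByValue() = default;
    virtual void push(T value) = 0;
};

// Hypothetical trait: true if `q.push(arg)` is a well-formed call.
// Arg = U& models an lvalue argument, Arg = U models an rvalue argument.
template <typename Q, typename Arg, typename = void>
struct accepts : std::false_type {};

template <typename Q, typename Arg>
struct accepts<Q, Arg,
               std::void_t<decltype(std::declval<Q&>().push(std::declval<Arg>()))>>
    : std::true_type {};

using Shared = std::shared_ptr<int>;
using Unique = std::unique_ptr<int>;
using Vec = std::vector<int>;

// Option 1: both calls compile at the call site, but an implementation can
// only copy from the const reference, which a move-only type does not allow.
static_assert(accepts<QueueConstRef<Shared>, Shared&>::value);
static_assert(accepts<QueueConstRef<Unique>, Unique>::value);
static_assert(!std::is_copy_constructible_v<Unique>);

// Option 2: rvalues are accepted, lvalues are rejected.
static_assert(accepts<QueueRvalueRef<Unique>, Unique>::value);
static_assert(!accepts<QueueRvalueRef<Shared>, Shared&>::value);

// Option 3: rvalues and lvalues of copyable types are accepted, including
// an lvalue vector, which is then copied. Lvalues of move-only types are not.
static_assert(accepts<QueueByValue<Shared>, Shared&>::value);
static_assert(accepts<QueueByValue<Unique>, Unique>::value);
static_assert(accepts<QueueByValue<Vec>, Vec&>::value);
static_assert(!accepts<QueueByValue<Unique>, Unique&>::value);
```

Note that with Option 1 the call queue.push(std::move(ptr)) itself compiles, because an rvalue binds to a reference to const. The limitation is on the implementation side: it cannot take ownership of the value.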

Solution

  • You expressed some concern about the following making an expensive copy:

    queue.push(veryLargeVector);
    

    But the problem is that there are only two things that this code can do. It can be made ill-formed, or it can copy veryLargeVector. Even if you take the argument by const reference (instead of by value) you will still need to make a copy from that reference into the queue's storage buffer.

    There is no way for the queue template to detect whether or not it will be expensive to make a copy of the type it was instantiated with. You basically have to make the same choice for all types: should queue.push(x); make a copy or should it be ill-formed? And if you decide to make it ill-formed, then queue.push(sharedPtr) will also be ill-formed, but you wanted to support that.

    And in fact queue.push(sharedPtr) should work. There is no widely-used generic container anywhere that only accepts rvalues (with the exception of ones that only store move-only types like std::unique_ptrs). If your API only accepts std::shared_ptrs and even ints by rvalue and not by lvalue, it will deviate from universal practice in the C++ community. Your users will find it highly annoying to have to write queue.push(int(x)) instead of queue.push(x) every time they want to copy an int into the queue. If you do not want to create this problem for users, you have to be willing to accept lvalues. And if you do accept lvalues, you have to copy from them. No other choice.

    So there are really two options:

    1. Provide a pair of overloads: push(const T&) and push(T&&). This is what the standard library does.
    2. Provide a single overload, push(T). This means that with an lvalue argument, there will be a copy and a move, and with an rvalue argument, two moves (or a single move if the argument is a prvalue, since the parameter is then constructed in place).

    If you expect almost all types that will be used with your queue to be ones that are cheaply movable, the second strategy will be more convenient.
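
The cost of the two strategies can be measured with a small experiment. This is only a sketch, assuming C++17. Counts, Tracked, OverloadQueue, ByValueQueue and the count_* helpers are hypothetical names, and the queues are plain non-virtual classes backed by std::deque to keep the example short:

```cpp
#include <cassert>
#include <deque>
#include <utility>

// Hypothetical helper: records how often a value was copied or moved.
struct Counts {
    int copies = 0;
    int moves = 0;
};

// Hypothetical element type that reports its copies and moves to a Counts.
struct Tracked {
    Counts* counts;
    explicit Tracked(Counts& c) : counts(&c) {}
    Tracked(const Tracked& other) : counts(other.counts) { ++counts->copies; }
    Tracked(Tracked&& other) noexcept : counts(other.counts) { ++counts->moves; }
};

// Strategy 1: a pair of overloads, as std::vector::push_back has.
template <typename T>
class OverloadQueue {
public:
    void push(const T& value) { items_.push_back(value); }
    void push(T&& value) { items_.push_back(std::move(value)); }
private:
    std::deque<T> items_;
};

// Strategy 2: a single by-value overload.
template <typename T>
class ByValueQueue {
public:
    void push(T value) { items_.push_back(std::move(value)); }
private:
    std::deque<T> items_;
};

// Push an lvalue and report the copies and moves it caused.
template <typename Queue>
Counts count_lvalue_push() {
    Counts c;
    Tracked t(c);
    Queue q;
    q.push(t);
    return c;
}

// Push an xvalue (the result of std::move) and report the counts.
template <typename Queue>
Counts count_rvalue_push() {
    Counts c;
    Tracked t(c);
    Queue q;
    q.push(std::move(t));
    return c;
}

// Push a prvalue (a temporary) and report the counts.
template <typename Queue>
Counts count_prvalue_push() {
    Counts c;
    Queue q;
    q.push(Tracked(c));
    return c;
}

void check_counts() {
    // Overload pair: one copy for an lvalue, one move for an rvalue.
    assert(count_lvalue_push<OverloadQueue<Tracked>>().copies == 1);
    assert(count_lvalue_push<OverloadQueue<Tracked>>().moves == 0);
    assert(count_rvalue_push<OverloadQueue<Tracked>>().moves == 1);
    // By value: a copy plus a move for an lvalue, two moves for an xvalue.
    assert(count_lvalue_push<ByValueQueue<Tracked>>().copies == 1);
    assert(count_lvalue_push<ByValueQueue<Tracked>>().moves == 1);
    assert(count_rvalue_push<ByValueQueue<Tracked>>().moves == 2);
    // By value with a temporary: the parameter is built in place, one move.
    assert(count_prvalue_push<ByValueQueue<Tracked>>().moves == 1);
}
```

Calling check_counts() from main runs the checks. In this sketch the by-value overload costs one extra move per push compared with the overload pair, which is why it is mainly attractive when moves are cheap.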