Tags: c++, function, return, performance, code-readability

Return vs. Not Return of functions?


To return or not to return: that is the question for functions! Or does it really matter?


Here goes the story: I used to write code like the following:

Type3 myFunc(Type1 input1, Type2 input2){}

But recently my project colleagues told me that I should avoid writing functions like this as much as possible, and suggested the following approach instead, which passes the returned value as an extra parameter:

void myFunc(Type1 input1, Type2 input2, Type3 &output){}

They convinced me that this is better and faster because the first method incurs an extra copy when returning.


I have started to believe that the second method is better in some situations, especially when I have multiple things to return or modify. For example, the second of the following two lines would be better and faster than the first, since it avoids copying the whole vector<int> when returning.

vector<int> addTwoVectors(vector<int> a, vector<int> b){}
void addTwoVectors(vector<int> a, vector<int> b, vector<int> &result){}

But, in some other situations, I cannot buy it. For example,

bool checkInArray(int value, vector<int> arr){}

will be definitely better than

void checkInArray(int value, vector<int> arr, bool &inOrNot){}

In this case, I think the first method, which returns the result directly, is better in terms of readability.


In summary, I am confused about (emphasis on C++):

  • What should be returned by functions and what should not (or try to avoid)?
  • Is there any standard way or good suggestions for me to follow?
  • Can we do better in both readability and code efficiency?

Edit: I am aware that, under some conditions, we have to use one of them. For example, I have to use functions that return a value if I need method chaining. So please focus on the situations where both methods can achieve the goal.

I know this question may not have a single definitive answer. It also seems this decision needs to be made in many programming languages, like C, C++, etc. Thus any opinion or suggestion is much appreciated (preferably with examples).


Solution

  • As always when someone brings up the argument that one thing is faster than the other: did you take timings? In fully optimized code, in every language and every compiler you plan to use? Without that, any argument based on performance is moot.
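As a rough sketch of what taking such timings could look like in C++: the helper `averageMicroseconds` and the two variants `addByValue` and `addByReference` below are hypothetical names made up for illustration, not anything from the question. The numbers only mean something when compiled with optimizations enabled, and a serious comparison would need more care (warm-up, many runs, a proper benchmarking library).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical return-by-value variant; assumes both inputs have the same size.
std::vector<int> addByValue(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::vector<int> result(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) result[i] = a[i] + b[i];
    return result;
}

// Hypothetical out-parameter variant; overwrites whatever `result` held before.
void addByReference(const std::vector<int>& a, const std::vector<int>& b,
                    std::vector<int>& result) {
    assert(a.size() == b.size());
    result.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) result[i] = a[i] + b[i];
}

// Runs `work` `repetitions` times and returns the average wall-clock
// time per run in microseconds.
template <typename Work>
double averageMicroseconds(Work work, int repetitions) {
    assert(repetitions > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repetitions; ++i) work();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / repetitions;
}
```

Each variant would be wrapped in a lambda and passed to `averageMicroseconds`, with the result of every call consumed somehow (for example, added to a running sum) so the optimizer cannot discard the work being measured.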

    I’ll come back to the performance question in a second, just let me address what I think is more important first: There are good reasons to pass function parameters by reference, of course. The primary one I can think of right now is that the parameter is actually input and output, i.e., the function is supposed to operate on the existing data. To me, that is what a function signature taking a non-const reference indicates. If such a function then ignores what is already in that object (or, even worse, clearly expects to only ever get a default-constructed one), that interface is confusing.
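To make that distinction concrete, here is a minimal sketch with three hypothetical functions (`appendSquares`, `squares`, and `squaresInto` are invented for illustration). The first uses the non-const reference as a genuine input and output, the second returns a pure output, and the third is the kind of interface described above as confusing.

```cpp
#include <cassert>
#include <vector>

// In/out parameter: the function builds on what `accumulated` already holds,
// which is what a non-const reference in the signature suggests.
void appendSquares(const std::vector<int>& input, std::vector<int>& accumulated) {
    for (int v : input) accumulated.push_back(v * v);
}

// Pure output: nothing about the caller's existing data matters, so return it.
std::vector<int> squares(const std::vector<int>& input) {
    std::vector<int> result;
    result.reserve(input.size());
    for (int v : input) result.push_back(v * v);
    return result;
}

// Potentially confusing: takes a non-const reference but silently throws
// away its previous contents.
void squaresInto(const std::vector<int>& input, std::vector<int>& output) {
    output.clear();
    for (int v : input) output.push_back(v * v);
}

void exampleInOut() {
    std::vector<int> acc{1};
    appendSquares({2, 3}, acc);
    assert((acc == std::vector<int>{1, 4, 9}));   // existing element kept

    std::vector<int> out{7, 7, 7};
    squaresInto({2}, out);
    assert((out == std::vector<int>{4}));         // existing elements lost
}
```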

    Now, to come back to performance. I cannot speak for C# or Java (though I believe returning an object in Java would not cause a copy in the first place, just passing around a reference), and in C, you do not have references but might need to resort to passing pointers around (and then, I do agree that passing in a pointer to uninitialized memory is ok). But in C++, compilers have for a long time done return value optimization, RVO, which basically just means that in most calls like A a = f(b);, the copy constructor is bypassed and f will create the object directly in the right place. In C++11, we even got move semantics to make this explicit and use it in more places.
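One way to see this for yourself is to instrument a type so that it counts its own copies and moves. The sketch below is only an illustration: `Tracked`, the global counters, and the two factory functions are hypothetical names. Returning a temporary is elided by compilers in practice (and the elision is guaranteed since C++17); returning a named local is usually elided as well, and where it is not, the object is moved rather than copied.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Counters for this sketch only (hypothetical instrumentation).
int g_copies = 0;
int g_moves = 0;

struct Tracked {
    std::vector<int> data;
    explicit Tracked(std::size_t n) : data(n, 1) {}
    Tracked(const Tracked& other) : data(other.data) { ++g_copies; }
    Tracked(Tracked&& other) noexcept : data(std::move(other.data)) { ++g_moves; }
};

// Returns a temporary: the object is constructed directly in the caller's
// variable (guaranteed since C++17, done by compilers long before that).
Tracked makeTemporary(std::size_t n) {
    return Tracked(n);
}

// Returns a named local: usually elided too (NRVO); if the compiler does not
// elide, the return falls back to the move constructor, never the copy.
Tracked makeNamed(std::size_t n) {
    Tracked result(n);
    result.data.push_back(42);
    return result;
}

void exampleElision() {
    Tracked a = makeTemporary(1000);
    Tracked b = makeNamed(1000);
    assert(g_copies == 0);  // the copy constructor never ran
    assert(a.data.size() == 1000 && b.data.back() == 42);
}
```

Printing `g_moves` after such calls shows whether your compiler and flags actually elided the named return or fell back to a move.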

    Should you just return an A* instead? Only if you really long for the old days of manual memory management. At the very least, return an std::shared_ptr<A> or an std::unique_ptr<A>.
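A small sketch of the smart-pointer version, assuming C++14 for `std::make_unique`; the struct `A` here is a stand-in with an invented `payload` member, and `makeOwned` is a hypothetical factory name.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for the type A from the text; the payload member is made up.
struct A {
    std::vector<int> payload;
    explicit A(std::size_t n) : payload(n) {}
};

// Ownership is part of the return type: the caller cannot forget to delete,
// and only a pointer is transferred, never the payload.
std::unique_ptr<A> makeOwned(std::size_t n) {
    return std::make_unique<A>(n);
}

void exampleOwnership() {
    std::unique_ptr<A> owned = makeOwned(3);
    assert(owned->payload.size() == 3);

    // A unique_ptr can be handed over to a shared_ptr if sharing turns out
    // to be needed later; the reverse is not possible.
    std::shared_ptr<A> shared = std::move(owned);
    assert(owned == nullptr && shared.use_count() == 1);
}
```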

    Now, with multiple outputs, you get additional complications, of course. The first thing to do is to check whether your design is actually sound: each function should have a single responsibility, and usually that means returning a single value as well. But there are of course exceptions to this; e.g., a partitioning function will have to return two or more containers. In that situation, you may find that the code is easier to read with non-const reference arguments; or you may find that returning a tuple is the way to go.
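For comparison, here is a sketch of a partitioning function written both ways. The names `partitionReturning` and `partitionInto`, and the even/odd split, are hypothetical choices for illustration only.

```cpp
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Variant 1: both outputs are returned together as a tuple (evens, odds).
std::tuple<std::vector<int>, std::vector<int>>
partitionReturning(const std::vector<int>& values) {
    std::vector<int> evens, odds;
    for (int v : values) (v % 2 == 0 ? evens : odds).push_back(v);
    return std::make_tuple(std::move(evens), std::move(odds));
}

// Variant 2: outputs are non-const reference arguments; this sketch
// overwrites their previous contents.
void partitionInto(const std::vector<int>& values,
                   std::vector<int>& evens, std::vector<int>& odds) {
    evens.clear();
    odds.clear();
    for (int v : values) (v % 2 == 0 ? evens : odds).push_back(v);
}

void examplePartition() {
    const std::vector<int> values{1, 2, 3, 4, 5};

    std::vector<int> evens, odds;
    std::tie(evens, odds) = partitionReturning(values);
    assert((evens == std::vector<int>{2, 4}));
    assert((odds == std::vector<int>{1, 3, 5}));

    std::vector<int> evens2, odds2;
    partitionInto(values, evens2, odds2);
    assert(evens2 == evens && odds2 == odds);
}
```

With C++17, the call site of the tuple version can be shortened to `auto [evens, odds] = partitionReturning(values);`, which removes the need to declare the outputs beforehand. Which of the two reads better at the call site is the judgment call described above.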

    I urge you to write your code both ways, and come back the next day or after a weekend and look at the two versions again. Then, decide what is easier to read. In the end, that is the primary criterion for good code. For those few places where you can see a performance difference from an end-user workflow, that is an additional factor to consider, but only in very rare cases should it ever take precedence over readable code – and with a little more effort, you can usually get both to work anyway.