Tags: c++, string, function, vector, return

Scope problem in example code that splits a std::string into a std::vector of strings using a delimiter


I found this code example while searching for a snippet that splits a std::string into a vector of substrings (similar to PHP's explode), using ' ' (a space) as the delimiter. Example string: "one two three".

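#include <sstream>   // std::istringstream
#include <string>    // std::string, std::getline
#include <vector>    // std::vector

std::vector<std::string> split(const std::string& s, char delimiter)
{
   std::vector<std::string> tokens;
   std::string token;
   std::istringstream tokenStream(s);
   // Read up to each delimiter and collect the pieces.
   while (std::getline(tokenStream, token, delimiter))
   {
      tokens.push_back(token);
   }
   return tokens;
}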

My problem is with the scope of the variable 'tokens'. Is it an error to use such a split function, given that the scope of the local variable ends once the function returns? I have an idea how to correct the problem, but I am not confident in my C++ skills. I am curious about ways of doing this in standards up to C++0x, for a call such as explode(string, delimiter).


Solution

  • The tokens variable will indeed not survive the return of the function. But the return is by value, and the value returned survives in the calling context.

    As for optimization, it is best to let the compiler do its job and fine-tune only if necessary. Here, the compiler may generate:

    1) a copy elision, constructing the return value directly in its target. Example: auto r = split(s, ' ');
    2) a move assignment (C++11 and later) if the target was already constructed. Example: r = split(s, ' ');

    Both cases avoid an unnecessary copy of the data. You can have a look at the Test class here for understanding.

    Returning a reference to tokens would be UB, since the reference would refer to a variable that no longer exists. So the only sound way to avoid the return by value is to pass a reference parameter and have the function write directly into the target variable. But this will not outperform copy elision, and it would probably only rarely outperform a move assignment. If in doubt, write a benchmark.
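
The points above can be checked with a small sketch. Everything named here beyond split is an assumption made for illustration: Tracer is a hypothetical stand-in for a type that counts its copies and moves (it is not the Test class mentioned above), make_tracer returns a local by value the same way split returns tokens, and split_into is a hypothetical output-parameter variant. It assumes a C++11 or later compiler, since move semantics do not exist before that.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical tracer type: counts how often it is copied or moved.
struct Tracer
{
    static int copies;
    static int moves;
    Tracer() {}
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
    Tracer& operator=(const Tracer&) { ++copies; return *this; }
    Tracer& operator=(Tracer&&) noexcept { ++moves; return *this; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Returns a local by value, just like split() does with tokens.
Tracer make_tracer()
{
    Tracer local;
    return local;  // elided or moved, but not copied (C++11 and later)
}

// The split from the question: safe, because it returns by value.
std::vector<std::string> split(const std::string& s, char delimiter)
{
    std::vector<std::string> tokens;
    std::string token;
    std::istringstream tokenStream(s);
    while (std::getline(tokenStream, token, delimiter))
    {
        tokens.push_back(token);
    }
    return tokens;
}

// Hypothetical alternative: write into a vector supplied by the caller.
void split_into(const std::string& s, char delimiter,
                std::vector<std::string>& out)
{
    out.clear();
    std::string token;
    std::istringstream tokenStream(s);
    while (std::getline(tokenStream, token, delimiter))
    {
        out.push_back(token);
    }
}

// Small demonstration; call it from main() to run the checks.
void demo()
{
    Tracer a = make_tracer();  // case 1: construction, elided or moved
    a = make_tracer();         // case 2: existing target, move assignment
    assert(Tracer::copies == 0);
    assert(Tracer::moves >= 1);

    std::vector<std::string> out;
    split_into("one two three", ' ', out);
    assert(out == split("one two three", ' '));
}
```

Whether the construction in case 1 is fully elided or performed as a move depends on the compiler and language version, so the sketch only checks that no copy happens. The output-parameter form mainly pays off when the same vector is refilled many times and its already allocated capacity can be reused, which is the kind of case a benchmark would have to confirm.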