Tags: c++, iterator, generator, coroutine, yield-keyword

"yield" keyword for C++, How to Return an Iterator from my Function?


Consider the following code.

std::vector<result_data> do_processing() 
{
    pqxx::result input_data = get_data_from_database();
    return process_data(input_data);
}

std::vector<result_data> process_data(pqxx::result const & input_data)
{
    std::vector<result_data> ret;
    pqxx::result::const_iterator row;
    for (row = input_data.begin(); row != input_data.end(); ++row) 
    {
        // somehow populate output vector
    }
    return ret;
}

While I was thinking about whether or not I could expect Return Value Optimization (RVO) to happen, I found this answer by Jerry Coffin [emphasis mine]:

At least IMO, it's usually a poor idea, but not for efficiency reasons. It's a poor idea because the function in question should usually be written as a generic algorithm that produces its output via an iterator. Almost any code that accepts or returns a container instead of operating on iterators should be considered suspect.

Don't get me wrong: there are times it makes sense to pass around collection-like objects (e.g., strings) but for the example cited, I'd consider passing or returning the vector a poor idea.

Having some Python background, I like generators very much. In fact, if this were Python, I would have written the above function as a generator, to avoid having to process the entire data set before anything else can happen. For example:

def process_data(input_data):
    for item in input_data:
        # somehow process items
        yield result_data

If I correctly interpreted Jerry Coffin's note, this is what he suggested, isn't it? If so, how can I implement this in C++?


Solution

  • No, that’s not what Jerry means, at least not directly.

    yield in Python implements coroutines (more precisely, generators). C++ had no built-in equivalent before C++20. They can of course be emulated, but doing that cleanly is a bit involved.
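To illustrate what such an emulation can look like in its simplest form, here is a minimal sketch of an object that remembers its position between calls and hands out one result at a time. The types `input_row` and `result_data` and the class `process_data_generator` are hypothetical stand-ins (not pqxx types), and the "processing" is a placeholder doubling step.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for a database row and a processed result.
struct input_row { int value; };
struct result_data { int doubled; };

// Hand-written "generator": state that a Python generator would keep
// implicitly across yields is stored explicitly in pos_.
class process_data_generator {
public:
    // The generator only references the input; the caller must keep
    // the vector alive for as long as the generator is used.
    explicit process_data_generator(std::vector<input_row> const & input)
        : input_(input), pos_(0) {}

    // Writes the next result into `out` and returns true,
    // or returns false once the input is exhausted.
    bool next(result_data & out) {
        if (pos_ == input_.size())
            return false;
        out.doubled = input_[pos_].value * 2;  // placeholder processing
        ++pos_;
        return true;
    }

private:
    std::vector<input_row> const & input_;
    std::size_t pos_;
};
```

A caller would consume it with a loop such as `while (gen.next(item)) { /* use item */ }`, so each row is processed only when the next result is requested.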

    But what Jerry meant is simply that you should pass in an output iterator which is then written to:

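    template <typename O>
    void process_data(pqxx::result const & input_data, O iter) {
        // Write one result per input row through the output iterator.
        for (pqxx::result::const_iterator row = input_data.begin(); row != input_data.end(); ++row)
            *iter++ = some_value; // placeholder: a result_data computed from *row
    }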
    

    And call it:

    std::vector<result_data> result;
    process_data(input, std::back_inserter(result));
    

    I’m not convinced though that this is generally better than just returning the vector.
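For trying out the output-iterator style without a database, here is a self-contained sketch. It assumes hypothetical stand-in types `input_row` and `result_data` in place of the pqxx types, takes the input as an iterator range as well, and uses a placeholder doubling step as the "processing". The wrapper `process_to_vector` is likewise an illustrative name, showing that returning a vector can be layered on top of the generic version.

```cpp
#include <iterator>
#include <vector>

// Hypothetical stand-ins for a database row and a processed result.
struct input_row { int value; };
struct result_data { int doubled; };

// Generic algorithm: reads from [first, last) and writes each result
// through the output iterator, returning the advanced iterator.
template <typename InputIt, typename OutputIt>
OutputIt process_data(InputIt first, InputIt last, OutputIt out) {
    for (; first != last; ++first) {
        result_data r;
        r.doubled = first->value * 2;  // placeholder processing
        *out++ = r;
    }
    return out;
}

// Convenience wrapper for callers that simply want a vector back.
std::vector<result_data> process_to_vector(std::vector<input_row> const & input) {
    std::vector<result_data> result;
    result.reserve(input.size());
    process_data(input.begin(), input.end(), std::back_inserter(result));
    return result;
}
```

The same `process_data` could also write into a `std::list`, an `std::ostream_iterator` (given a suitable `operator<<`), or any other output iterator, which is the flexibility the iterator-based style buys.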