
Why is printing one character at a time slower than concatenating them in advance?


Say I have a std::vector containing all the lowercase letters repeated 1000 times. Printing its contents first char by char and then as one concatenated string shows that the second method is at least twice as fast (about three times as fast in the measurements below):

// Averaged over 1000 measurements
Chars mean: 1.314961314958 ms
Joined mean: 0.430487430487 ms

Output methods themselves:

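#include <algorithm>  // std::copy
#include <iterator>   // std::back_inserter
#include <ostream>
#include <string>
#include <vector>

// One formatted insertion (operator<<) per character.
void print_char(const std::vector<char>& chars, std::ostream& os)
{
    for (const char x : chars)
        os << x;
}

// Copy every character into one string first, then do a single insertion.
void print_join(const std::vector<char>& chars, std::ostream& os)
{
    std::string joined;
    joined.reserve(chars.size());
    std::copy(chars.begin(), chars.end(), std::back_inserter(joined));
    os << joined;
}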

Why is that? I thought the stream's built-in I/O buffering did the same job as the accumulator string joined.


Solution

  • You have a lot more than just I/O here: you have one call to os's operator<< per character (26,000 calls, if the vector holds the 26 lowercase letters repeated 1,000 times).

    As a result, everything that happens each time you stream some data must now happen once per character: constructing the sentry object that checks the stream's state, for example, or locking a mutex for thread safety if the implementation does that.

    When you stream a single string, that need only happen once. The stream is of course likely to do some state management internally while writing out the string, but there are all sorts of things it doesn't need to repeat.

    Even taking stream operation out of the equation for a moment, unless you have optimisations turned on and the call gets inlined, that's thousands of function calls you don't need.
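If the goal is simply to pay the per-call cost once, the intermediate string is not strictly needed either. Below is a minimal sketch, not the asker's actual benchmark: print_char is taken from the question, while make_chars, print_write and mean_ms are hypothetical names introduced here for illustration. print_write hands the vector's contiguous storage to the stream in a single unformatted write() call, and mean_ms is one plausible way to average timings over several runs into an in-memory std::ostringstream. No timings are claimed; they depend on the compiler, the optimisation level and the standard library.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: the 26 lowercase letters repeated `repeats` times.
std::vector<char> make_chars(std::size_t repeats)
{
    const std::string alphabet = "abcdefghijklmnopqrstuvwxyz";
    std::vector<char> chars;
    chars.reserve(alphabet.size() * repeats);
    for (std::size_t i = 0; i < repeats; ++i)
        chars.insert(chars.end(), alphabet.begin(), alphabet.end());
    return chars;
}

// From the question: one formatted insertion per character.
void print_char(const std::vector<char>& chars, std::ostream& os)
{
    for (const char x : chars)
        os << x;
}

// Hypothetical variant: a single unformatted write of the vector's
// contiguous storage, with no temporary string.
void print_write(const std::vector<char>& chars, std::ostream& os)
{
    os.write(chars.data(), static_cast<std::streamsize>(chars.size()));
}

// Hypothetical harness: mean wall-clock time in milliseconds of `runs`
// calls to `print`, each writing into a fresh std::ostringstream.
template <typename Print>
double mean_ms(Print print, const std::vector<char>& chars, int runs)
{
    assert(runs > 0);  // avoid dividing by zero below
    using clock = std::chrono::steady_clock;
    std::chrono::duration<double, std::milli> total{0.0};
    for (int i = 0; i < runs; ++i) {
        std::ostringstream os;
        const auto start = clock::now();
        print(chars, os);
        total += clock::now() - start;
    }
    return total.count() / runs;
}
```

For example, mean_ms(print_char, make_chars(1000), 1000) and mean_ms(print_write, make_chars(1000), 1000) can be compared directly, and print_join from the question has the same signature, so it fits the same harness. Both functions produce identical output; only the number of trips through the stream machinery differs.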