Tags: c++, io, stdout, cout, fmt

How to speed up character output?


I want to make this for loop run faster by optimizing the I/O:

for ( int row = 0; row < Y_AxisLen; ++row )
{
    for ( int col = 0; col < X_AxisLen; ++col )
    {
        std::cout << characterMatrix[ row ][ col ];
    }
}

Here characterMatrix is a std::vector< std::vector<char> > that I need to print. Is printing one char at a time bad for performance? Also, should I use the {fmt} library instead of std::cout?

Would this be faster?

std::array<char, X_AxisLen> rowStr { };

for ( int row = 0; row < Y_AxisLen; ++row )
{
    for ( int col = 0; col < X_AxisLen; ++col )
    {
        rowStr[ col ] = characterMatrix[ row ][ col ];
    }

    std::cout << rowStr.data( );
    // fmt::print( "{}", rowStr.data( ) ); // Or using this one. But will this even work?
}

Solution

  • You can write the whole row in one go and use fmt::print for better performance:

    #include <fmt/core.h>

    #include <string_view>
    #include <vector>

    int main() {
      auto X_AxisLen = 10000u;
      auto Y_AxisLen = 10000u;
      // Y_AxisLen rows, each holding X_AxisLen characters.
      auto characterMatrix =
          std::vector<std::vector<char>>(Y_AxisLen, std::vector<char>(X_AxisLen));
      for (auto i = 0u; i < Y_AxisLen; ++i) {
        const auto& row = characterMatrix[i];
        // A string_view carries the length, so the row needs no null terminator.
        fmt::print("{}", std::string_view(row.data(), row.size()));
      }
    }
    
    % c++ test.cc -O3 -DNDEBUG -std=c++17 -I include src/format.cc -o test-fmt
    % time ./test-fmt > /dev/null
    ./test-fmt > /dev/null  0.03s user 0.04s system 52% cpu 0.135 total
    

    For comparison, this is roughly 30 times (not 30 percent) faster than writing character by character with std::cout:

    #include <iostream>
    #include <vector>

    int main() {
      auto X_AxisLen = 10000u;
      auto Y_AxisLen = 10000u;
      // Y_AxisLen rows, each holding X_AxisLen characters.
      auto characterMatrix =
          std::vector<std::vector<char>>(Y_AxisLen, std::vector<char>(X_AxisLen));
      for (auto row = 0u; row < Y_AxisLen; ++row) {
        for (auto col = 0u; col < X_AxisLen; ++col) {
          std::cout << characterMatrix[row][col];
        }
      }
    }
    
    % c++ test.cc -O3 -DNDEBUG -std=c++17 -I include src/format.cc -o test-cout
    % time ./test-cout > /dev/null
    ./test-cout > /dev/null  4.30s user 0.08s system 95% cpu 4.581 total
    

    This example is somewhat artificial; in real-world code the difference may be less dramatic, particularly if you turn off synchronization with stdio (std::ios::sync_with_stdio(false)). However, the {fmt} result can also be improved by using format string compilation and the unsynchronized API (if you are writing to a file).