I want to make this for loop run faster by optimizing the I/O:
for ( int row = 0; row < Y_AxisLen; ++row )
{
    for ( int col = 0; col < X_AxisLen; ++col )
    {
        std::cout << characterMatrix[ row ][ col ];
    }
}
characterMatrix is a std::vector< std::vector<char> >, and I need to print it out. Is printing one char at a time bad for performance? Also, should I use the {fmt} library instead of std::cout?
Will this work faster?
std::array<char, X_AxisLen> rowStr { };
for ( int row = 0; row < Y_AxisLen; ++row )
{
    for ( int col = 0; col < X_AxisLen; ++col )
    {
        rowStr[ col ] = characterMatrix[ row ][ col ];
    }
    std::cout << rowStr.data( );
    // fmt::print( "{}", rowStr.data( ) ); // Or using this one. But will this even work?
}
You can write the whole row in one go and use fmt::print
for better performance:
#include <fmt/core.h>
#include <string_view>
#include <vector>

int main() {
  auto X_AxisLen = 10000u;
  auto Y_AxisLen = 10000u;
  // Y_AxisLen rows, each holding X_AxisLen characters.
  auto characterMatrix =
      std::vector<std::vector<char>>(Y_AxisLen, std::vector<char>(X_AxisLen));
  for (auto i = 0u; i < Y_AxisLen; ++i) {
    const auto& row = characterMatrix[i];
    // Print the whole row with one call instead of one call per character.
    fmt::print("{}", std::string_view(row.data(), row.size()));
  }
}
% c++ test.cc -O3 -DNDEBUG -std=c++17 -I include src/format.cc -o test-fmt
% time ./test-fmt > /dev/null
./test-fmt > /dev/null 0.03s user 0.04s system 52% cpu 0.135 total
For comparison, this is ~30 times (not percent) faster than using cout
and writing character by character:
#include <iostream>
#include <vector>

int main() {
  auto X_AxisLen = 10000u;
  auto Y_AxisLen = 10000u;
  // Y_AxisLen rows, each holding X_AxisLen characters.
  auto characterMatrix =
      std::vector<std::vector<char>>(Y_AxisLen, std::vector<char>(X_AxisLen));
  for (auto row = 0u; row < Y_AxisLen; ++row) {
    for (auto col = 0u; col < X_AxisLen; ++col) {
      // One formatted output call per character: this is the slow path.
      std::cout << characterMatrix[row][col];
    }
  }
}
% c++ test.cc -O3 -DNDEBUG -std=c++17 -I include src/format.cc -o test-cout
% time ./test-cout > /dev/null
./test-cout > /dev/null 4.30s user 0.08s system 95% cpu 4.581 total
This example is a bit artificial; in the real world the difference may not be as dramatic, particularly if you turn off synchronization with stdio (std::ios::sync_with_stdio(false)). However, the {fmt} result can also be improved by using format string compilation and the unsynchronized API (if you are writing to a file).