c++ windows benchmarking google-benchmark

What is the meaning of Iterations in Google Benchmark?

I am using Google Benchmark to measure the execution time of some code. For example, I wrote the following benchmarks:

#include <benchmark/benchmark.h>

// Alternatively, can add libraries using linker options.
#ifdef _WIN32
#pragma comment ( lib, "Shlwapi.lib" )
#ifdef _DEBUG
#pragma comment ( lib, "benchmarkd.lib" )
#else
#pragma comment ( lib, "benchmark.lib" )
#endif
#endif

static void BenchmarkTestOne(benchmark::State& state) {
    int Sum = 0;
    while (state.KeepRunning())
    {
        for (size_t i = 0; i < 100000; i++)
        {
            Sum += i;
        }
    }
}

static void BenchmarkTestTwo(benchmark::State& state) {
    int Sum = 0;
    while (state.KeepRunning())
    {
        for (size_t i = 0; i < 10000000; i++)
        {
            Sum += i;
        }
    }
}

// Register the function as a benchmark
BENCHMARK(BenchmarkTestOne);
BENCHMARK(BenchmarkTestTwo);


// Run the benchmark
BENCHMARK_MAIN();

When I run the code above, it prints the following results:

Benchmark                 Time             CPU   Iterations
-----------------------------------------------------------
BenchmarkTestOne     271667 ns       272770 ns         2635
BenchmarkTestTwo   27130981 ns     27644231 ns           26

But I couldn't figure out what Iterations means here. Also, why are Time and CPU different from each other?

Solution

Google Benchmark tries to benchmark each candidate for a similar amount of time, and/or for long enough to get stable results.

The benchmark counts how many iterations it actually did, along with the exact time. A much slower per-iteration benchmark will do far fewer iterations.

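The printout shows the (calculated) per-iteration time and the (counted) number of iterations of the benchmark function. Time is wall-clock time per iteration and CPU is CPU time per iteration, so the two columns come from different clocks and will rarely match exactly.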

It might actually be a count of how many times state.KeepRunning() returned true (that is, how many times the body of your while loop ran), but I don't know that level of detail.
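To make the relationship between per-iteration cost and the Iterations column concrete, here is a minimal sketch using only the standard library. It is a simplified model, not Google Benchmark's actual algorithm: the names `RunResult`, `run_for_min_time`, `sum_loop`, and `g_sink` are made up for this illustration. It repeats a workload until a minimum amount of time has passed, counts the repetitions, and divides the total time by that count.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical result type for this sketch (not part of Google Benchmark).
struct RunResult {
    std::uint64_t iterations;  // how many times the workload ran
    double ns_per_iteration;   // total elapsed time divided by iterations
};

// Simplified model: repeat the workload until at least `min_time` has
// elapsed, counting repetitions. The real library chooses its iteration
// count differently in detail; this only illustrates the idea.
template <typename Workload>
RunResult run_for_min_time(Workload&& work, std::chrono::nanoseconds min_time) {
    assert(min_time.count() > 0);
    using clock = std::chrono::steady_clock;
    std::uint64_t iterations = 0;
    const auto start = clock::now();
    auto now = start;
    do {
        work();
        ++iterations;
        now = clock::now();
    } while (now - start < min_time);
    const double total_ns =
        std::chrono::duration<double, std::nano>(now - start).count();
    return {iterations, total_ns / static_cast<double>(iterations)};
}

// The result is stored to a volatile so it is observably used.
// An optimizing compiler may still simplify the loop itself.
volatile std::uint64_t g_sink = 0;

void sum_loop(std::uint64_t n) {
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < n; ++i) sum += i;
    g_sink = sum;
}
```

Calling `run_for_min_time([] { sum_loop(100000); }, std::chrono::milliseconds(100))` and then the same with `10000000` should show the pattern from the question's output: within the same time budget, the more expensive workload completes far fewer iterations, unless the compiler has reduced both loops to a constant-time computation.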

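Just FYI, your benchmark loops don't return their result or store it to a volatile after the loop, so a compiler could easily optimize the loop away. Also note that signed overflow is UB in C and C++, and your sums will definitely exceed the range of a 32-bit int. (Strictly speaking, because i is a size_t, Sum += i does the addition in unsigned arithmetic and then converts the result back to int, which is implementation-defined before C++20 rather than UB, but the value still doesn't fit.)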

(Or clang could still optimize those sum loops into a closed form formula based on Gauss's n * (n+1) / 2 but avoiding overflow.)
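As a quick check of the overflow and closed-form remarks, the sketch below computes the same sums in 64 bits. The function names are hypothetical, chosen for this example. Because the loops in the question run i from 0 to n - 1, the closed form used here is n * (n - 1) / 2, a shifted version of Gauss's formula.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Closed form for 0 + 1 + ... + (n - 1), the sum the question's loops compute.
// Halving the even factor first keeps the intermediate product small.
constexpr std::uint64_t sum_below(std::uint64_t n) {
    return (n % 2 == 0) ? (n / 2) * (n - 1) : n * ((n - 1) / 2);
}

// Reference implementation: the same sum as a plain loop, in 64 bits.
inline std::uint64_t sum_by_loop(std::uint64_t n) {
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < n; ++i) sum += i;
    return sum;
}

// True if `value` does not fit in an int on this platform.
constexpr bool exceeds_int(std::uint64_t value) {
    return value > static_cast<std::uint64_t>(std::numeric_limits<int>::max());
}

// The two loop bounds from the question, evaluated at compile time.
static_assert(sum_below(100000) == 4999950000ULL, "sum for BenchmarkTestOne");
static_assert(sum_below(10000000) == 49999995000000ULL, "sum for BenchmarkTestTwo");

// Self-check: the closed form agrees with the loop for small n.
inline void check_small_cases() {
    for (std::uint64_t n = 0; n < 100; ++n) {
        assert(sum_below(n) == sum_by_loop(n));
    }
}
```

With a 32-bit int, whose maximum is 2147483647, `exceeds_int(sum_below(100000))` is true, so even the smaller benchmark's sum does not fit in `int Sum`.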

Benchmarking with optimization disabled is useless; don't do it.