c++ time execution

Execution time of a short function c++


I want to measure the execution time of a very small function so that I can compare a recursive implementation with an iterative one.

clock() clearly does not have enough resolution for that. Could you show me how to use other time sources such as GetThreadTimes()? I saw the description on the Microsoft website, but did not grasp the principle.

Also, the <chrono> header is not available in MS Visual C++ 10.

The code:

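#include <cstdlib>   // srand, rand
#include <ctime>     // time
#include <iostream>  // cout, cin

using namespace std;

// Recursive search: scans from the end and stores the index of 17 in num (-1 if absent).
int search (int a[], int size, int& num) {
    if (size > 0) {
        if (a[size-1] == 17) { num = size-1; return num; }
        else { return search (a, size-1, num); }
    }
    else { return num = -1; }
}

// Iterative search: stores the index of the first 17 in num (-1 if absent).
int searchit (int a[], int size, int& num) {
    num = -1;
    for (int i = 0; i < size; i++) {
        // Stop at the first match so a later non-matching element cannot reset num.
        if (a[i] == 17) { num = i; break; }
    }
    return num;
}

int main () {
    srand ((unsigned int) time(0));
    int num = 0;
    const int size = 40;
    int a[size];
    for (int i = 0; i < size; i++) {
        a[i] = rand() % 100;
        cout << a[i] << endl;
    }
    cout << '\n';

    search (a, size, num);

    cout << num << endl;

    cin.get();
    cin.ignore();
}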

Solution

  • One solution would be to do many iterations as suggested by @amchacon. This has the advantage of being straightforward and easy.

    It has the disadvantage of possibly producing inaccurate or misleading results, for two reasons: the compiler may apply different inlining and instruction-scheduling heuristics to the loop, and both the instruction and data caches are warm after the first iteration.
    So while your function might really perform poorly because of a bad memory access pattern (say, two dozen cache misses costing you 500 cycles each), none of that may show up when you run the function a hundred times, provided the whole set of cache lines fits into the caches.

    What are the alternatives?

    a) Not applicable to your problem (since you want to test a recursive algorithm), but I'll state it anyway for the "general" case: use IACA (the Intel Architecture Code Analyzer). It is designed specifically for statically analyzing a small section of code down to the instruction.

    b) Use a higher-precision timer, or a counter that isn't a timer at all. For this purpose, Windows offers QueryPerformanceCounter and QueryThreadCycleTime (Vista and later). Cycles may be preferable to time, depending on what you want to measure.

    c) Query the thread times. This is, in my opinion, the best way: you get reliable, precise, accurate times (unlike timers, which may include context switches and time spent in other processes), it works for any kind of code, and it lets you distinguish kernel time from user time in case your code calls system functions, as well as CPU time from wall time.
    Call GetThreadTimes once before and once after running your function, and subtract the respective UserTime and KernelTime values.
    Alternatively, start a worker thread if you are also interested in wall time (for wall time, you subtract CreationTime from ExitTime, and you obviously only get a valid ExitTime after the thread has exited). Calculating wall time may be useful if your code also involves blocking I/O operations.