What is the execution time of a parallel OpenMP application?

Let's say that we run an OpenMP application with N threads, and each thread records its own execution time using omp_get_wtime(). This gives N execution times, one per thread. Which of those times should be used as the execution time of the parallel application, for example when estimating the speedup? The smallest one? The largest one? The average of all of them? It is not practical to keep all these times, and I wonder if there is a single value to use instead.

Solution

The execution time is exactly what the name says: the time taken from the start of execution of the program to the end of that execution. You can measure it without any internal instrumentation of the code; on Linux, for example, use the time command, whose "real" figure is the elapsed wall-clock time.

Internal measurements of execution time for individual threads or particular code regions may be useful for understanding load imbalance and other reasons for poor scaling, but the absolute wall-clock time to execute the whole code is what matters most.
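To illustrate the difference between per-thread times and the wall-clock time, here is a minimal sketch in C. It is not taken from the question's code: busy_work, time_region, timing_t and MAX_THREADS are hypothetical names, and the uneven workload is made up to provoke load imbalance. It times one parallel region from outside (a single measurement) and also inside each thread. Because a parallel region ends with an implicit barrier, the outside measurement should in practice be at least as large as the slowest thread's time, so one outside measurement is enough for a speedup estimate. The serial fallbacks are only there so the sketch still compiles without OpenMP support.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: C11 timespec_get is available). */
#include <time.h>
static double omp_get_wtime(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

#define MAX_THREADS 256   /* hypothetical upper bound for this sketch */

typedef struct {
    double region;    /* wall-clock time of the whole parallel region */
    double slowest;   /* largest per-thread time */
    double fastest;   /* smallest per-thread time */
    int    nthreads;  /* number of threads whose times were recorded */
} timing_t;

/* Deliberately uneven dummy work: higher thread ids (mod 4) do more. */
static double busy_work(int tid) {
    volatile double sum = 0.0;
    long n = 2000000L * (tid % 4 + 1);
    for (long i = 0; i < n; ++i)
        sum += (double)i * 1e-9;
    return sum;
}

timing_t time_region(void) {
    double per_thread[MAX_THREADS] = {0.0};
    int nthreads = 1;

    double t0 = omp_get_wtime();          /* one timer around everything */
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        double start = omp_get_wtime();   /* per-thread timer */
        busy_work(tid);
        double elapsed = omp_get_wtime() - start;
        if (tid < MAX_THREADS)
            per_thread[tid] = elapsed;
        if (tid == 0)
            nthreads = omp_get_num_threads();
    }                                     /* implicit barrier here */
    double region = omp_get_wtime() - t0;

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;

    timing_t t = { region, per_thread[0], per_thread[0], nthreads };
    for (int i = 1; i < nthreads; ++i) {
        if (per_thread[i] > t.slowest) t.slowest = per_thread[i];
        if (per_thread[i] < t.fastest) t.fastest = per_thread[i];
    }
    return t;
}
```

Calling time_region() from main and printing the fields (built with, for example, gcc -fopenmp) should show the region time tracking the slowest thread, while the gap between slowest and fastest gives a rough picture of the load imbalance. For the speedup, divide the wall-clock time of a one-thread run by the wall-clock time of the N-thread run.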

P.S. If you're going to plot scaling performance, my CpuFun blog posts on "Presenting Parallel Performance" may be interesting.