Should we measure the average or the minimum execution time of a routine?

When we measure the execution time of a routine (always using the same input) we don't get the same result (time) even if we should because the program runs in a multitasking environment. Even if we run our program as 'high priority' some interference from other programs running in the CPU, will of course affect the results slightly.

The purpose of the measurement is to time that routine before and after code optimization.

Most people will time the function multiple times and make an average. Why don't we look at the smallest execution time instead of the average?

Solution

You should always aim to time the minimum time.
Because if you do you are sure you're only timing your own code and nothing else.

Aim for minimum time
If your code has only a single execution path then you should always take the minimum time (out of many many repetitions) as the actual time taken.
This way you can get timings accurate to within one or two CPU cycles. Just to be clear, you run the code snippet for millions of runs and take the lowest sample of that run as the timing.
Then you wrap those millions of runs in a loop that runs 10x or 100x and again take the lowest timing. Like so:

Lowest = MaxInt;
loop 100x 
  loop million times
     Clock.Start;
     DoTest;
     Timing = Clock.Time;
     if (timing < Lowest) {Lowest = timing}

The other loop resets the context which sometimes helps. This matters e.g. if the JIT compiler kicks in late. The outer loop gives it a change to reset.

You can also your timings in the outer loop and then divide by a million if the code snippet is especially fast. In that case you'll run an extra empty timing loop and subtract the time taken by the empty loop from the time taken in the busy loop.
You'll have to get clever though to prevent code optimization from eliminating the empty loop :-).

If your code has multiple possible paths, then you can't really time its execution time. Running just a simple loop with fixed input, because that will only give you a partial time of just one code path. This is probably not representative of real world performance.

Make your runs deterministic
Always try and fix the code so there is only one path the code can take.
Or try and set up the test so all possible paths are taken in succession and then time the minimum time taken for everything and divide by the number of code paths tested.

Everything and the kitchen sink profiling
If that's not possible you'll have to take the average, but note that in that case you're not really timing just your code anymore, you're also taking system overhead, HDD interruptions, and background processes into account.