I'm trying to time the execution of a function before attempting to optimize it. (The code is Elixir, but I'm using Erlang's :timer.tc.)
My general approach is "run it lots of times, then calculate the average duration." But the average decreases dramatically the more times I run it (up to a point).
An example:
some_func = fn ->
  # not my actual function; it's a pure function,
  # but exhibits the same speedup
  :rand.uniform()
end

run_n_times = fn (count, func) ->
  Enum.each(1..count, fn (_i) ->
    func.()
  end)
end

n = 20
{microseconds, :ok} = :timer.tc(run_n_times, [n, some_func])
IO.puts "#{microseconds / n} microseconds per call (#{microseconds} total for #{n} calls)"
Outputs for increasing values of n are like this (lightly formatted):
174.8       microseconds per call (3496 total for 20 calls)
21.505      microseconds per call (4301 total for 200 calls)
4.5755      microseconds per call (9151 total for 2000 calls)
0.543415    microseconds per call (108683 total for 200000 calls)
0.578474    microseconds per call (578474 total for 1000000 calls)
0.5502955   microseconds per call (1100591 total for 2000000 calls)
0.556457    microseconds per call (2225828 total for 4000000 calls)
0.544754125 microseconds per call (4358033 total for 8000000 calls)
Why does a function run faster the more I call it, and what does this imply for benchmarking? E.g., is there a rule of thumb like "run something >= 200k times in order to benchmark it"?
Since your function is very fast (it does basically nothing), what I think you're seeing here is the fixed overhead of the setup, not any speedup in the function itself. Before the repetitions even begin, you have to construct a range, construct an anonymous function, and call Enum.each. For small repetition counts, these fixed costs probably contribute more to the overall runtime of the benchmark than the repetitions do; as the count grows, they get amortized across more calls and the average converges toward the true per-call cost.
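Put another way, the measured average is roughly per_call + fixed_overhead / n. Plugging in your own numbers as a rough sanity check: if the fixed overhead is on the order of 3400 microseconds and the true per-call cost is about 0.55 microseconds, then n = 20 predicts about 0.55 + 3400 / 20 ≈ 170.6 microseconds per call, close to the observed 174.8, while at n = 200000 the overhead adds only about 0.017 microseconds per call, which is why your averages level off around 0.55.

One way to check this directly is to time the same harness with a function that does nothing and subtract that from the real measurement. Here's a minimal sketch, reusing run_n_times and some_func from your snippet; the noop and measure names and the choice of n = 1_000_000 are just illustrations, not anything from your setup:

# Estimate the harness cost (range, loop, call dispatch) by running
# the same loop with a do-nothing function, then subtract it.
noop = fn -> :ok end

measure = fn count, func ->
  {micros, :ok} = :timer.tc(run_n_times, [count, func])
  micros
end

n = 1_000_000
overhead = measure.(n, noop)      # loop and dispatch alone
total = measure.(n, some_func)    # loop, dispatch, plus the real work
IO.puts "#{(total - overhead) / n} microseconds per call, net of harness overhead"

As for a rule of thumb: rather than a fixed count like 200k, the model above suggests increasing n until the average stops changing, since that's the point where the fixed overhead has become negligible. A dedicated benchmarking library such as Benchee also handles warm-up, repetition, and statistics for you.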