Paradoxal timing functions

I have a function to compute the time spent in a block:

import collection.mutable.{Map => MMap}
var timeTotalMap = MMap[String, Long]()
var numMap = MMap[String, Float]()
var averageMsMap = MMap[String, Float]()
def time[T](key: String)(block: =>T): T = {
  val start = System.nanoTime()
  val res = block
  val total = System.nanoTime - start
  timeTotalMap(key) = timeTotalMap.getOrElse(key, 0L) + total
  numMap(key) = numMap.getOrElse(key, 0f) + 1
  averageMsMap(key) = timeTotalMap(key)/1000000000f/numMap(key)
  res
}

I am timing a function and the place where it is called in one place.

time("outerpos") { intersectPos(a, b) }

and the function itself starts with:

def intersectPos(p1: SPosition, p2: SPosition)(implicit unify: Option[Identifier] =None): SPosition
    = time("innerpos") {(p1, p2) match {
   ...
}

When I display the nano times for each key (timeTotalMap), I get (added spaces for lisibility)

outerpos->  37 870 034 714
innerpos->      53 647 956

It means that the total execution time of innerpos is 1000 less than the one of outerpos.

What ! there is a factor 1000 between the two ? And its says that the outer call takes 1000x more time than all the inner functions? am I missing something or is there a memory leaking ghost somewhere ?

Update

When I compare the number of execution for each block (numMap), I find the following:

outerpos -> 362878
innerpos -> 21764

This is paradoxal. Even if intersectPos is called somewhere else, shouldn't the number of times it is called be as great as the number of times outerpos is called?

EDIT

If I move the line numMap(key) = numMap.getOrElse(key, 0f) + 1 to the top of the time functinon, then these numbers become approximately equal.

Solution

nanoTime on JVM is considered safe but not very accurate. Here are some good reads:

Is System.nanoTime() completely useless?

System.currentTimeMillis vs System.nanoTime

Basically your test will suffer from timer inaccuracy. One way to work around that would be to call time("outerpos") for a quite a long time (note JIT compiler optimization might kick in at some point) and measure start and end times only once. Take an average and repeat with time("innerpos"). That's all I could think of. Still not the best test ever ;)