Search code examples
javatimesetexecution

common techniques to measure execution time provide different values (java)


I was trying to measure the time of different operations on different type of sets and wanted to compare them, but the values I get differ very much, on the same type of set, like factor 1000. I use the common techniques I read here: How do I time a method's execution in Java?

I compared Hashset, TreeSet and LinkedHashSet. I filled the sets with 1 000 000 integers, used the methode contains() and iterated through the sets. I measured the time on each operations and the values differed very much. So I did this a second time with new sets of the same type and the execution times i get don't seem to be legit.

The same type of set need once 1400 milliseconds then 300 milliseconds to be filled. Why is that?

Here is a code sample, it may make it clearer what i mean:

    public static void main(String[] args){

    HashSet<Integer> firstHashSet = new HashSet<>(predefinedSize);
    HashSet<Integer> secondHashSet = new HashSet<>(predefinedSize);
    LinkedHashSet<Integer> firstLinkedHashSet = new LinkedHashSet<>(predefinedSize);
    LinkedHashSet<Integer> secondLinkedHashSet = new LinkedHashSet<>(predefinedSize);
    TreeSet<Integer> firstTreeSet = new TreeSet<>();
    TreeSet<Integer> secondTreeSet = new TreeSet<>();
    int x = 9432;
    System.out.println("filling hashSet:        <" + fillSet(firstHashSet) + "> milliSeconds");
    System.out.println("filling linkedSet:      <" + fillSet(firstLinkedHashSet) + "> milliSeconds");
    System.out.println("filling treeSet:        <" + fillSet(firstTreeSet) + "> milliSeconds");
    System.out.println("-------------------------------------------------------------");
    System.out.println("filling hashSet:        <" + fillSet(secondHashSet) + "> milliSeconds");
    System.out.println("filling linkedSet:      <" + fillSet(secondLinkedHashSet) + "> milliSeconds");
    System.out.println("filling treeSet:        <" + fillSet(secondTreeSet) + "> milliSeconds");

this is what one of my fillset looks like:

private static int size = 1000000;
private static int predefinedSize = 2000000;

public static double fillSet(LinkedHashSet<Integer> myHashSet){
    double timeStart = System.nanoTime();
    for(int i=0; i<size; i++){
        myHashSet.add(i);
    }
    double time = (System.nanoTime() - timeStart)/ Math.pow(10, 6);
    return time;
}

and the output is this:

filling hashSet:        <52.14022> milliSeconds
filling linkedSet:      <95.599435> milliSeconds
filling treeSet:        <2172.773956> milliSeconds
-------------------------------------------------------------
filling hashSet:        <59.096929> milliSeconds
filling linkedSet:      <1006.638126> milliSeconds
filling treeSet:        <241.36395> milliSeconds

you see the output differs very much, I assume it depends on the computing power of my pc, but I dont run any other program on the background. Can someone give me an explanation and/or sollution to this?


Solution

  • As @kan's comment mentioned, using a system timer and executing something a million times will provide wildly varying results. What you're looking for is a microbenchmark:

    How do I write a correct micro-benchmark in Java?

    As for the reasons your timings are all over the place, you'll have to read up about computer architecture and the Java JVM. Some possibilities:

    • Dynamic clock speed technologies in processors https://electronics.stackexchange.com/questions/62353/how-can-a-cpu-dynamically-change-its-clock-frequency - You can eliminate this possibility by turning your CPU's ability to change its clock speed off.
    • Your collection has 1 million elements of type Int which is 4 MiB. That size is just about at the limit of whether it will fit in your processor's cache or not, given that non-server CPUs will have between 1 and 8 MiB of cache. If at one execution your 1 million elements stayed in the cache longer than in another execution, you'll get a wildly different execution time. You can eliminate this possibility by making your collection either so small it definitely fits in cache (tens of kilobytes max), or so large that it won't work with cache at all (maybe a hundred megabytes).
    • You may not be running any other applications, but there's still other stuff running in the background on your computer. (Antivirus, update service, 10-20 other tasks related to your operating system's inner-workings)
    • The Java Virtual Machine's may behave differently (this I can't be too sure about because I'm no expert on the inner-workings of JIT, GC, and other stuff that may affect execution time). The microbenchmark libraries will largely remove this possible variance.