Search code examples
javaarraysarraylisttime-complexitybenchmarking

Are arrays supposed to be faster than ArrayLists by this much?


I have implemented two methods, shuffleList and shuffleArray which use the exact same functionality. And I have an ArrayList of half millions integer, and an array of the same half million ints. In my benchmarking code, which performs each one of the methods a 100 times on the corresponding array or ArrayList and records the time, it looks like shuffleArray takes around 0.5 seconds while shuffleList takes around 3.5 seconds, even though the code does not use any ArrayList methods but get and set, which are supposed to work as fast as they work in an array.

Now I know that ArrayLists are a little bit slower because they internally use arrays but with some additional code, but does it make this big of a difference?

     void shuffleList(List<Integer> list){
        Random rnd = ThreadLocalRandom.current();
        for(int i=list.size()-1;i>0;i--){
            int index=rnd.nextInt(i+1);
            int a=list.get(index);
            list.set(index,list.get(i));
            list.set(i,a);
        }
    }

    void shuffleArray(int[] ar)
    {
        Random rnd = ThreadLocalRandom.current();
        for (int i = ar.length - 1; i > 0; i--)
        {
            int index = rnd.nextInt(i + 1);
            int a = ar[index];
            ar[index] = ar[i];
            ar[i] = a;
        }
    }

Benchmarking code:

import org.openjdk.jmh.Main;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
public class MyBenchmark {


    @Benchmark
    @Fork(value = 1)
    @Warmup(iterations = 3)
    @Measurement(iterations = 10)
    public void compete() {
        try {
            Sorting sorting = new Sorting();
            sorting.load();
            System.out.println(sorting.test());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        Main.main(args);
    }
}


    protected List<Integer> list = new ArrayList<Integer>();
    protected List<int[]> arrays= new ArrayList<>();

    protected void load(){
        try (Stream<String> stream = Files.lines(Paths.get("numbers.txt"))) {
            stream.forEach(x -> list.add(Integer.parseInt(x)));
        } catch (IOException e) {
            e.printStackTrace();
        }
        finally{
            int[] arr =new int[list.size()];
            for(int i=0;i<list.size();i++)
                arr[i]=list.get(i);
            arrays.add(arr);
        }
    }

    protected double test(){
        int arr[]=arrays.get(0);
        Stopwatch watch = new Stopwatch();
        for (int i=0; i<100; i++){
          shuffleArray(arr);
          shuffleList(list);
        }
        return watch.elapsedTime();
    }

I comment out one of the methods on the for loop and use the other.

Update:

I did what a lot of you suggested of changing Int a to Integer a in the shuffleList method, and it is making it a little bit faster, it is 3 seconds instead of 3.5 now, but I still think it is a big difference.

It is worth mentioning that changing int[] arr to Integer[] arr in the shuffleArray method with keeping int a as it is to simulate the boxing and unboxing time for the array does actually make it a lot slower, it makes it take around 3 seconds, so I can make the array as slow as the ArrayList but I can not do the opposite.

Update:

Using Collections.swap() in shuffleList did indeed make it as fast as the array, but I still do not understand why, is my benchmarking too sensetive or does it really matter?

Final shuffleList code, courtesy of Andy Turner and Joop Eggen:

    protected void shuffleList(List<Integer> list){
        Random rnd = ThreadLocalRandom.current();
        for(int i=list.size()-1;i>0;i--){
            int index=rnd.nextInt(i+1);
            Collections.swap(list, i, index);
        }
    }


Solution

  • Use Integer a, which save one unboxing and one boxing operation.

        for (int i = list.size()-1; i>0; i--){
            int index=rnd.nextInt(i+1);
            Integer a=list.get(index);
            list.set(index,list.get(i));
            list.set(i,a);
        }
    

    And the Integer objects use more memory.


    @Andy Turner mentioned the exist Collections#swap.

        for (int i = list.size()-1; i > 0; i--) {
            int index = rnd.nextInt(i+1);
            Collections.swap(list, i, index);
        }
    

    Without warm-up of JIT compiler this might slow-down the bench-mark, but will look better in production code. Though then you would probably use the Collections.shuffle anyway.


    As commented the swap version is fast too. First the OP showed using the right microbenchmarking code.

    swap uses the original Integer class too. It does l.set(i, l.set(j, l.get(i))); in order to swap - as set returns the previous element at that position. The JIT compiler can probably unwrap set and utilize that previous element immediately.