Search code examples
javamultithreadingperformancesortingmergesort

Java multithreaded merge sort speed issue


I'm trying to implement multithreaded merge sort in Java. The idea is to recursively call to new threads on every iteration. Everything works properly, but problem is that regular single-thread version appears to be much more faster. Please, help fixing it. I've tried to play with .join(), but it haven't brought any success. My code:

public class MergeThread implements Runnable {

    private final int begin;
    private final int end;

    public MergeThread(int b, int e) {
        this.begin = b;
        this.end = e;
    }

    @Override
    public void run() {
        try {
            MergeSort.mergesort(begin, end);
        } catch (InterruptedException ex) {
            Logger.getLogger(MergeThread.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}

public class MergeSort {
    private static volatile int[] numbers;
    private static volatile int[] helper;

    private int number;

    public void sort(int[] values) throws InterruptedException {
    MergeSort.numbers = values;
    number = values.length;
    MergeSort.helper = new int[number];
    mergesort(0, number - 1);
    }

    public static void mergesort(int low, int high) throws InterruptedException {
    // check if low is smaller than high, if not then the array
    // is sorted
    if (low < high) {
        // Get the index of the element which is in the middle
        int middle = low + (high - low) / 2;
        // Sort the left side of the array
            Thread left = new Thread(new MergeThread(low, middle));
            Thread right = new Thread(new MergeThread(middle+1, high));

            left.start();
            right.start();
            left.join();
            right.join();

        // combine the sides
        merge(low, middle, high);
    }
}

    private static void merge(int low, int middle, int high) {
    // Copy both parts into the helper array
    for (int i = low; i <= high; i++) {
        helper[i] = numbers[i];
    }

    int i = low;
    int j = middle + 1;
    int k = low;
    // Copy the smallest value from either the left or right side
    // back to the original array
    while (i <= middle && j <= high) {
        if (helper[i] <= helper[j]) {
        numbers[k] = helper[i];
        i++;
        } else {
        numbers[k] = helper[j];
        j++;
        }
        k++;
    }
    // Copy the rest of the left side of the array
    while (i <= middle) {
        numbers[k] = helper[i];
        k++;
        i++;
    }
}

    public static void main(String[] args) throws InterruptedException {
        int[] array = new int[1000];
        for(int pos = 0; pos<1000; pos++) {
            array[pos] = 1000-pos;
        }
        long start = System.currentTimeMillis();
        new MergeSort().sort(array);
        long finish = System.currentTimeMillis();

        for(int i = 0; i<array.length; i++) {
            System.out.print(array[i]+" ");
        }
        System.out.println();
        System.out.println(finish-start);

    }
}

Solution

  • There are several factors here. First of all, you are spawning too many threads. A lot more than the number of cores your processor has. If I understand your algorithm correctly you are doing something like log2(n) at the bottom level of your tree.

    Given that you're doing processor intensive computations, not involving I/O, once you pass the number of cores with your thread count, the performance starts degrading pretty fast. Hitting something like several thousand threads will slow and in the end crash the VM.

    If you want to actually benefit from having a multi-core processor in this computation you should try to use a fixed size thread-pool (upper bounded on the number of cores or thereabout) or an equivalent thread reuse policy.

    Second point, if you want to do a valid comparison you should try with computations that last longer (sorting 100 numbers doesn't qualify). If not, you are taking a significant relative hit from the cost of creating threads.