Search code examples
javascriptnode.jsalgorithmsortingquicksort

QuickSort algorithm failed with stackoverflow error


I tried to implement QuickSort algorithm from Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein - Introduction to Algorithms, Third Edition - 2009 * Section 7.1

in JavaScript.

So here is my implementation:

function swap(arr, i, j) {
    const temp = arr[i];
    arr[i] = arr[j];
    arr[j] = temp;
}

function partition(arr, left = 0, right = arr.length - 1) {
    let pivot = arr[right];
    let i = left - 1;
    let j;

    for (j = left; j < right; j++) {
        if (arr[j] < pivot) {
            i++;
            swap(arr, i, j);
        }
    }

    swap(arr, i + 1, right);
    return i + 1;
}

function quickSort(arr, left = 0, right = arr.length - 1) {
    if (left >= right) return arr;
    const p = partition(arr, left, right);
    quickSort(arr, left, p - 1);
    quickSort(arr, p + 1, right);
    return arr;
}

The problem with this code that it fails with stackoverflow error IF i pass already sorted array with length > 10 000

In case i pass array with fully random integers - everything working like expected even with 50 000 elements. So i can't understand is it problem with my code, or my node just can't handle such big call stack for worst-case quick sort usage.


Solution

  • With large enough arrays, you will eventually reach the maximum stack size and get that exception. The reason you're seeing it on sorted arrays earlier is because of the way you're choosing your pivot.

    You've implemented it with your pivot as the last element of the array, which happens to mean that your worst case scenario occurs when you're given a sorted array. Your pivot value is always the largest value (or smallest if sorted in the opposite direction), and thus every element is less than the pivot, with none greater. So rather than splitting the array into two roughly equal subarrays, you split it into a single sub array that has only one fewer element than you started with.

    One way to choose the pivot to avoid this is to pick the pivot randomly. This makes it unlikely (but not impossible) to hit the worst case, and so on average will work well. It does have a vulnerability in that if someone knows the random number algorithm and seed that you're using, then they can craft an array which will force you into the worst case.

    The ideal pivot is the median value. So one approach is to try to find the median or close to it. For example, you can take a sample of 3 random elements of the array, and use the median of those 3 as your pivot. This is unlikely to be exactly the median but is good enough most of the time. You could sample more to get a better approximation of the median, but then you're spending more time trying to find the pivot than just moving forward with the algorithm; it's a bit of a tradeoff.