java data-structures stack-overflow quicksort

Quicksort Stack Overflow with Big array sizes and few distinct values

I'm working on implementing a Quicksort algorithm which I've got to work fine with array sizes up to 100,000. Once I try to sort with a size of 1,000,000 I get a stack overflow error (which is happening to my best understanding due to the recursive functionality of my algorithm).
I know this question has been asked on here before, but none of those answers helped me at all. I've looked through my code extensively, and even modeled it off a Java textbook just to double check, still no fix.
I've been reading on here for at least an hour trying to solve this, and I read that the call stack could hold up to 8MB or so depending on the system. I'm wondering if either:

My algorithm is wrong
1,000,000 elements is too large to use for a quicksort (with this compiler).

EDIT: To anyone interested: I discovered that increasing my random interval from 1-9 to 1-n (n being the size of the sequence being sorted, ex: 1-1000000), my quicksort ran extremely faster, and of course didn't have any overflow problems.

I'm going to submit my code now, with the driver, and hopefully someone can quickly show me where I went wrong.

public class QuickSort {

    public static void main(String[] args) {

        //JUST TO TEST THAT IT WORKS
        int[]A = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1}; //worst case

        quickSort(A, 0, A.length-1);

        //print the array
        for (int a :A)
            System.out.print(a+" ");
        System.out.println();

    }

    /**
     * Quicksort algorithm, O(n log n) best and average case, O(n^2) worst case
     * @param S sequence to sort
     * @param a the lower bound of sequence
     * @param b upper bound of sequence
     */
    public static void quickSort(int[]S, int a, int b) {
        if (a >= b)
            return;
        int p = S[b]; //setting pivot to the last element in sequence
        int l = a;
        int r = b - 1;
        int temp;

        while (l <= r) { //once left and right have crossed this will end while
            while (l<= r && S[l] <= p) {
                l++; //move in from left side until found an element greater than the pivot
            }
            while (l <= r && S[r] >= p) {
                r--; //move in from right side until element found less than the pivot
            }
            if (l <= r) {
                //swap S[l] and S[r] //swap the left and right elements if they haven't crossed
                temp = S[l];
                S[l] = S[r];
                S[r] = temp;
                l++;
                r--;
            }
        }
        //left and right have crossed here
        //swap S[l] and S[b] //put the pivot back to the new partition spot
        temp = S[l];
        S[l] = S[b];
        S[b] = temp;
        quickSort(S, a, l-1); //call quicksort on our new sublists partitioned around our pivot
        quickSort(S, l+1, b);
        //recursive calls
    }
}

The Driver:

import java.util.Random;

public class SortingCompare {

    public static void main(String[] args) {
        Random rand = new Random();
        System.out.printf("Sorting Run Times:\n");
        System.out.printf("Array Size   Insertion Sort%5s%s\n", " ","Quick sort");

            int A[] = new int[100000];
            int n = 100000;

            for (int i = 0; i < n; i++) {
                A[i] = rand.nextInt(9) + 1; //1-9
            }
            //array is filled with random integers

            long startQuickSort = System.currentTimeMillis();
            QuickSort.quickSort(A, 0, A.length - 1);
            long quickTime = System.currentTimeMillis() - startQuickSort;

            System.out.printf("%-5d%10dms%15s%dms\n", n, insertionTime, " ", quickTime);
    }
}

Solution

You should sort the smaller of the two partitions first, to minimize stack usage, and use an insertion sort for partitions less than say 16 elements.

You also need to look up the Quicksort algorithm. You don't need two tests in each inner loop: they completely destroy the point of the algorithm; and the details of your implementation vary from the canonical version by Sedgewick.