Problems with quicksort in Forth for big sorted arrays

I use Quicksort to sort integers being elements in sets represented by entries on a stack. It works okay except when it has to sort larger (about 10,000 elements) sets which happens to be sorted already.

: adswap \ ad1 ad2 -- 
  over @ over @ swap rot ! swap ! ; 

: singlepart \ ad1 ad2 -- ad
  tuck 2dup @ locals| p ad | swap                \ ad2 ad2 ad1
  do i @ p <                                     \ ad2 flag
     if ad i adswap ad cell + to ad then cell    \ ad2 cell
  +loop ad adswap ad ;                           \ ad 

: qsort \ ad1 ad2 --      pointing on first and last cell in array
  2dup < 
  if 2dup singlepart >r
     swap r@ cell - recurse
     r> cell + swap recurse 
  else 2drop 
  then ;

Could it be overflow in the return stack? It's virtually impossible for the program to keep in track when the array is sorted or not, so how to solve the problem?

Solution

Yes, Quicksort is known to be a subject of the return stack overflow in the edge cases in naive implementation. The solution is also known: use smaller part for recursion and another part for tail call. Oh, this recipe is already described in Wikipedia too:

To make sure at most O(log n) space is used, recurse first into the smaller side of the partition, then use a tail call to recurse into the other.

Tail call optimization transforms a call into jump, so it does not use the return stack.

Updated qsort definition:

: qsort \ ad1 ad2 --      pointing on first and last cell in array
  begin
    2dup < 0= if 2drop exit then
    2dup - negate 1 rshift >r \ keep radius (half of the distance)
    2dup singlepart 2dup - >r >r \ ( R: radius distance2 ad )
    r@ cell - swap r> cell+ swap \ ( d-subarray1 d-subarray2 )
    2r> u< if 2swap then recurse \ take smallest subarray first
  again \ tail call optimization by hand
;