I'm studying the Shell sort from Sedgewick Algorithms in C part 1-4 on p172.
I use size
(the length of array), not l
and r
(start and end);
so my code is
int i,j,h;
int key;
for( h=1;h<=(size-1)/9;h=h*3+1);
for(;h>0;h/=3)
{
for(i=h;i<size;i++)
{
key=num[i];
j=i;
while(j>=h&&key>num[j-h];j-=h)
{
num[j]=num[j-h];
}
num[j]=key;
}
}
I know all of this. I read TAOCP. I know 1, 4, 13, … is the best sequence (comparable). But in this position, my code has
for(h=1;h<size;h=h*3+1);
My question is: why did he write h<(size-1)/9
?
What does the '/9
' mean?
The loop:
for (h = 1; h < size; h = h * 3 + 1)
;
overshoots the size of the array most of the time. The alternative loop keeps the gap within range.
You can see this for yourself with a trivial test program like this:
#include <stdio.h>
static inline int hs_gap9(int size)
{
int h;
for (h = 1; h <= (size - 1) / 9; h = h * 3 + 1)
;
return h;
}
static inline int hs_gap3(int size)
{
int h;
for (h = 1; h < size; h = h * 3 + 1)
;
return h;
}
int main(void)
{
for (int i = 1; i < 100; i++)
printf("Size: %3d; gap9 = %d; gap3 = %d\n", i, hs_gap9(i), hs_gap3(i));
return 0;
}
Sample output:
Size: 1; gap9 = 1; gap3 = 1
Size: 2; gap9 = 1; gap3 = 4
Size: 3; gap9 = 1; gap3 = 4
Size: 4; gap9 = 1; gap3 = 4
Size: 5; gap9 = 1; gap3 = 13
Size: 6; gap9 = 1; gap3 = 13
Size: 7; gap9 = 1; gap3 = 13
Size: 8; gap9 = 1; gap3 = 13
Size: 9; gap9 = 1; gap3 = 13
Size: 10; gap9 = 4; gap3 = 13
Size: 11; gap9 = 4; gap3 = 13
Size: 12; gap9 = 4; gap3 = 13
Size: 13; gap9 = 4; gap3 = 13
Size: 14; gap9 = 4; gap3 = 40
Size: 15; gap9 = 4; gap3 = 40
Size: 16; gap9 = 4; gap3 = 40
…
Size: 34; gap9 = 4; gap3 = 40
Size: 35; gap9 = 4; gap3 = 40
Size: 36; gap9 = 4; gap3 = 40
Size: 37; gap9 = 13; gap3 = 40
Size: 38; gap9 = 13; gap3 = 40
Size: 39; gap9 = 13; gap3 = 40
Size: 40; gap9 = 13; gap3 = 40
Size: 41; gap9 = 13; gap3 = 121
Size: 42; gap9 = 13; gap3 = 121
Size: 43; gap9 = 13; gap3 = 121
…
Size: 97; gap9 = 13; gap3 = 121
Size: 98; gap9 = 13; gap3 = 121
Size: 99; gap9 = 13; gap3 = 121
As you can see, the 'gap3' algorithm returns an initial value of h
that is larger than the size of the array. The 'gap9' algorithm returns an initial value of h
that is smaller than the size of the array. This saves a little overhead on the loops (saves one iteration of the outer loop, where the middle loop exits on the first cycle without touching the inner loop.