I am trying to understand the ShellSort code in K&R book at page 62. But there is one part i am not sure of.
So here is the original code from the book:
void shellsort(int* v, int n) {
int gap, i, j, temp;
for (gap = n / 2; gap > 0; gap /= 2) {
for (i = gap; i < n; i++) {
for (j = i - gap; j >= 0 && v[j] > v[j + gap]; j -= gap) {
temp = v[j];
v[j] = v[j + gap];
v[j + gap] = temp;
}
}
}
}
And i am trying to understand why there is third loop. It could be only if couldnt it?
Here is changed version of code (the one that i think could work as well):
void shellsort(int* v, int n) {
int gap, i, j, temp;
for (gap = n / 2; gap > 0; gap /= 2) {
for (i = gap; i < n; i++) {
j = i - gap;
if (v[j] > v[j + gap]) {
temp = v[j];
v[j] = v[j + gap];
v[j + gap] = temp;
}
}
}
}
And when i run the code, and it outputs the same thing as the first code:
Output:
12345679
But surely there is some reason using for
there. And i am not able to find what reason that is. So i thought someone can clear this out?
You might get a better feeling for what's going on if you trace what the algorithm does. Here is a version of your program with some extra print statements:
void shellsort(int* v, int n) {
int gap, i, j, temp;
for (gap = n / 2; gap > 0; gap /= 2) {
printf("enter outer loop with gap = %d\n", gap);
for (i = gap; i < n; i++) {
printf("- enter second loop with i = %d\n", i);
for (j = i - gap; j >= 0 && v[j] > v[j + gap]; j -= gap) {
temp = v[j];
v[j] = v[j + gap];
v[j + gap] = temp;
}
printf("- after innermost loop:");
print_array(v, n);
}
}
}
(I omitted the definition of print_array
.)
Calling this with an array { 5, 4, 3, 2, 1 }
, as a commenter suggested, gives this output:
5 4 3 2 1
enter outer loop with gap = 2
- enter second loop with i = 2
- after innermost loop: 3 4 5 2 1
- enter second loop with i = 3
- after innermost loop: 3 2 5 4 1
- enter second loop with i = 4
- after innermost loop: 1 2 3 4 5
enter outer loop with gap = 1
- enter second loop with i = 1
- after innermost loop: 1 2 3 4 5
- enter second loop with i = 2
- after innermost loop: 1 2 3 4 5
- enter second loop with i = 3
- after innermost loop: 1 2 3 4 5
- enter second loop with i = 4
- after innermost loop: 1 2 3 4 5
1 2 3 4 5
But here is what happens if I use your code, using just an if
instead of the innermost for
loop:
5 4 3 2 1
enter outer loop with gap = 2
- enter second loop with i = 2
- after swap: 3 4 5 2 1
- enter second loop with i = 3
- after swap: 3 2 5 4 1
- enter second loop with i = 4
- after swap: 3 2 1 4 5
enter outer loop with gap = 1
- enter second loop with i = 1
- after swap: 2 3 1 4 5
- enter second loop with i = 2
- after swap: 2 1 3 4 5
- enter second loop with i = 3
- after swap: 2 1 3 4 5
- enter second loop with i = 4
- after swap: 2 1 3 4 5
2 1 3 4 5
The result is incorrect because the 1
is not propagated to the beginning of the array. This is due to the missing inner loop. In the original version, at gap = 2
and i = 4
, the program compares 5
and 1
and swaps them; then compares 3
and 1
and swaps them as well to ensure that these three elements (1
, 3
, 5
) are in the correct relative order. Without the inner loop, this second swap is not done. There would be a chance to repair this in the iteration with gap = 1
, but again 1
is only swapped with one element (3
) but not swapped with 2
.
Or, for a shorter but more obscure answer: Shell sort performs a loop for various "gap sizes" over a variant of insertion sort. If you know insertion sort, you know that it consists of two nested loops. If you remove the innermost loop, you break the inner insertion sort.
Finally, in your example that just worked, you simply got unlucky: If the input is (mostly) sorted, even broken sorting algorithms can appear to work. These things are hard to test.