Search code examples
cinsertion-sortknuthtaocp

Knuth List Insertion Method in C


I've been reading through the sorting and searching algorithms in Volume 3 of The Art of Computer Programming by Donald Knuth, Second Edition. I came across an algorithm which Knuth calls "list insertion" (A modification on traditional insertion sort) on page 95.

On that page, Knuth concludes that the "right data structure for straight insertion is a one-way, linked linear list.", and that "linked allocation (Section 2.2.3) is ideally suited to insertion, since only a few links need to be changed." However, the MIXAL program (Program L) on page 97 does not appear to utilize a traditional linked linear list structure (a series of nodes linked by addresses). Instead, it appears that the key and the link are stored together in a struct-like structure and these structures are stored in an array called INPUT.

I decided to try to implement this algorithm in C. I have provided Knuth's description of the algorithm, and his implementation in the MIXAL assembly language as reference. I decided to let the keys be the elements, themselves, in the data array, and I put the links in a parallel-like array called links. I say 'parallel-like array' because the size of the links array is one greater than the size of the data array. I did this so I could easily determine the index of the smallest element of the data array by storing it as the first element in the links array. Because of this extra index in links, the indices 0 - (n - 1) of the data array correspond to indices 1 - n of the links array. Each element in the links array corresponds to the index in the data array of the next element in the sorted list.

My question is, is this how this algorithm is supposed to be implemented based on his description, or am I missing something?

list insertion description

list insertion MIXAL implementation

int *listInsertion(int data[], int n) {

    if (n > 1) {
        int i, j, entry;
        int *links = (int *) calloc(n + 1, sizeof *links);
        links[n] = -1;
        links[n - 1] = n - 1;

        for (i = n - 2; i >= 0; i--) {
            entry = data[i];
            for (j = i + 1; links[j] >= 0 && entry > data[links[j]]; 
                    j = links[j] + 1)
                continue;

            if (j == i + 1) {
                links[i] = i;
            } else {
                links[i] = links[i + 1];
                links[i + 1] = links[j];
                links[j] = i;
            }
        }
        return links;
    }
    return NULL;
}

Solution

  • I suggest you implement the algorithm with the symbol mentioned by Knuth first.
    This will help you figure out the first version quickly.

    void insertSort(const int *K, int *L, int n) {
      if (n == 1) return;
    
      L[n] = n-1; 
      L[n-1] = n;
    
      for (int j = n-2; j >= 0; j--) {
        int entry = K[j];
        int p = L[n];
        int q = n;
    
        while(entry > K[p]) {
          q = p;
          p = L[q];
          if (p == n) {
            break;
          }
        }
    
        L[q] = j;
        L[j] = p;
      }
    }
    

    Then you can refactor your first version to enhance it or make it shorter.