Search code examples
c++sortingpointerspass-by-referencebubble-sort

What is wrong with this implementation of bubble sort using linked lists?


So this is the thing, I need to implement the bubble sort algorithm on linked lists. Yes, I saw many answers written in Stack Overflow that addresses this question. Almost all of them are implemented using pointers-to-pointers. Here, I tried to implement this using a referenced pointer rather than double pointers. I've spent more than a day to figure out what's wrong with my code.

My implementation of the bubble-sort and swap functions are as follows:

struct node
{
    int data;
    node* next;
    node(int x)
    {
        data = x;
        next = NULL;
    }
};
node* swap(node *a, node *b){
    node* temp = b->next;
    b->next = a;
    a->next = temp;
    return b;
}
void bubbleSort(node*& head) {
    node* ptr = head;
    node* last = NULL;
    int swapped;
    do
    {
        swapped = 0;
        ptr = head;

        while (ptr->next != last)
        {
            if (ptr->data > ptr->next->data)
            {
                if (ptr == head)
                {
                    head = swap(ptr, ptr->next);
                    swapped = 1;
                }

                else{
                    ptr = swap(ptr, ptr->next);
                    swapped = 1;
                }
                
            }

            ptr = ptr->next;
            
        }
        
    } while (swapped);
    
}

The output for the above code (which is wrong) is as follows:

The original list = 
2 -> 1 -> 4 -> 3 -> 6 -> 5 -> 

The Sorted list = 
1 -> 2 -> 4 -> 6 -> 

I know this is some basics for most of you. But please I would be very grateful if you can take some of your time to see what's wrong with this code.

Isn't the fact that I've used referenced pointers fulfill the same things that will be done by double-pointers. For example, this bubble-sort implementation is done using double-pointers?


Solution

  • Swapping is a bit special for lists. Swapping 2 consecutive nodes involves touching 4 nodes, the 2 swapped nodes, and the nodes preceding them. That's why the length of the list output from your sort() was not the same length as its input.

    Taking this in consideration, the swap operation then becomes:

    void swap_nodes(Node* a_pred, Node*& a, Node* b_pred, Node*& b)
    {
        assert(a != nullptr && b != nullptr && b_pred != nullptr);
        assert(!a_pred || a_pred->next == a);
        assert(b_pred->next == b); 
    
        if (a == b)
            return;
    
        b_pred->next = a;
        if (a_pred)
            a_pred->next = b;
    
        Node* t = a->next;
        a->next = b->next;
        b->next = t;
    
        t = a;
        a = b;
        b = t;
    }
    

    Also, you cannot escape early by testing if no swap has occured in the inner loop. This test only tells you that the current outer loop node is the smallest until the end, not that the range is fully sorted.

    Keeping track of the nodes preceding the ones being tested is imprtant, since not doing so would mean having to find these predecessors by looping through the list again.

    The sort function then becomes:

    void sort(Node*& head)
    {
        for (Node* a = head, *a_pred = nullptr; a != nullptr; a_pred = a, a = a->next)
        {
            for (Node* b = a->next, *b_pred = a; b != nullptr; b_pred = b, b = b->next)
            {
                if (b->data < a->data)
                {
                   swap_nodes(a_pred, a, b_pred, b);
                   if (a_pred == nullptr)
                       head = a;              
                } 
            }
        }
    }
    

    If you want to optimize a bit, you cannot reduce the amount of comparisons, but you can reduce the amount of swaps, you can try this:

    void sort(Node*& head)
    {
        for (Node* a = head, *a_pred = nullptr; a != nullptr; a_pred = a, a = a->next)
        {
            // find smallest node value in [a, end)
            Node* small_pred = a_pred;
            Node* small = a;
            for (Node* b = a->next, *b_pred = a; b != nullptr; b_pred = b, b = b->next)
            {
                if (b->data < small->data)
                {
                   small_pred = b_pred;
                   small = b;
                }
            }
    
            if (small != a)
            {
               // remove smallest from list.
               small_pred->next = small->next;
    
               // insert smallest before a.
               small->next = a;
               if (a_pred == nullptr)       // same as a == head
                   head = small;
               else 
                   a_pred->next = small;
    
               // keep outer loop happy, by making it look like a swap.
               a = small;
            } 
            // node a is now the smallest node in range [a, end)
            // move on to the next.
        }
    }
    

    [EDIT]

    The bubble sort above is the one used in the industry. The "pedantic" bubble sort, which for some unknown reason is still being taught is here:

    void pedantic_bubble_sort(Node*& head) {
      for (Node* sentinel = nullptr; sentinel != head;) {
        bool swaps = false;
    
        for (Node *pred = nullptr, *a = head, *b = a->next; a && b;
             pred = a, a = b, b = b->next) {
          if (b->data < a->data) {
            swap_nodes(pred, a, a, b);
            if (!pred) head = a;
            swaps = true;
          }
          if (b->next == sentinel) {
            sentinel = b;
            break;
          }
        }
    
        if (!swaps) break;
      }
    }
    

    imho, this algorithm is cleverly slow.

    You can tinker with it here, and compare performance with other algorithms and std::list::sort(): https://godbolt.org/z/Yco8WY