Search code examples
pythonswapdoubly-linked-list

Why does one variation of my doubly linked list node swap work while another almost identical variation does not on adjacent nodes?


I'm practicing linked data structures in Python. One problem I'm working on is a method to swap two nodes within a doubly linked list without just moving data. So no node1._data=node2._data or anything like that. I have to mess around the the next and previous pointers. It is a method as part of a doubly linked list class. This method in particular takes in pointers to nodes as parameters. Not values that it has to search for in the list. It's meant to be used in a sort where swapping occurs.

I have two versions. The first works perfectly while the second has issues. The difference is the placement of exactly one line. They are otherwise identical. The working code is as follows:

def _swap(self, node1, node2):
        assert l is not None and r is not None, "nodes to swap cannot be None"
        
        #Handle cases where nodes are front or rear
        if node1==self._front:
            self._front=node2
        elif node2==self._front:
            self._front=node1
            
        if node1==self._rear:
            self._rear=node2
        elif node2==self._rear:
            self._rear=node1


        #Swap pointers
        node1._next, node2._next = node2._next, node1._next
        
        
        if node1._next is not None:
            node1._next._prev=node1
        if node2._next is not None:
            node2._next._prev=node2
            
        node1._prev, node2._prev = node2._prev, node1._prev
        
        if node1._prev is not None:
            node1._prev._next=node1
        if node2._prev is not None:
            node2._prev._next=node2
            
        return

The non-working code is almost exactly identical but fails if the two nodes in question are directly adjacent to each to each other. The difference is the swap pointers section. The swaps for the previous pointers are changed alongside the swaps for the next pointers. Like so

 #Swap pointers
        node1._next, node2._next = node2._next, node1._next
        node1._prev, node2._prev = node2._prev, node1._prev
        
        
        if node1._next is not None:
            node1._next._prev=node1
        if node2._next is not None:
            node2._next._prev=node2
            
        
        if node1._prev is not None:
            node1._prev._next=node1
        if node2._prev is not None:
            node2._prev._next=node2
            
        return

The line is exactly the same but placed a few lines up. Why does this affect swaps of adjacent nodes? Why does it work fine when dealing with non-adjacent nodes? This is a just hunch I'm just thinking of right now but does it have to do with the fact that when the nodes are adjacent and you swap the pointers you end up having things pointing to themselves?

When I first started out trying to tackle this problem I used the second incorrect implementation which caused a headache. After moving that one line around it works just fine but I don't know why it's such a big deal.


Solution

  • Let's visualise this for an example input with four nodes, in which we want to swap the two middle nodes:

                        node1             node2
                         │                 │
    ┌─────────────┐   ┌──┴──────────┐   ┌──┴──────────┐   ┌─────────────┐
    │ _data: 1    │   │ _data: 2    │   │ _data: 3    │   │ _data: 4    │
    │             │   │             │   │             │   │             │
    │ _next: ─────►───┤ _next: ─────►───┤ _next: ─────►───┤ _next: None │
    │             ├───◄─────────┐   ├───◄─────────┐   ├───◄─────────┐   │
    │ _prev: None │   │ _prev: ─┘   │   │ _prev: ─┘   │   │ _prev: ─┘   │
    └─────────────┘   └─────────────┘   └─────────────┘   └─────────────┘
    

    We can ignore the code that deals with head/tail boundary cases, and then the first swap statement is this:

    node1._next, node2._next = node2._next, node1._next
    

    This results in:

                                  ┌──────────────────────────┐ 
                        node1     │       node2              │
                         │        │        │                 │
    ┌─────────────┐   ┌──┴────────▲─┐   ┌──┴──────────┐   ┌──┴──────────┐
    │ _data: 1    │   │ _data: 2  │ │   │ _data: 3    │   │ _data: 4    │
    │             │   │           │ │   │             │   │             │
    │ _next: ─────►───┤ _next: ───┘ │   │ _next: self │   │ _next: None │
    │             ├───◄─────────┐   ├───◄─────────┐   ├───◄─────────┐   │
    │ _prev: None │   │ _prev: ─┘   │   │ _prev: ─┘   │   │ _prev: ─┘   │
    └─────────────┘   └─────────────┘   └─────────────┘   └─────────────┘
    

    ...where notably node2._next === node2 (a self reference).

    And now the two versions of the code go different ways.

    Correct code

    The correct code continues with the following:

            node1._next._prev=node1
            node2._next._prev=node2
    
                                ┌──────────────────────────────────────┐
                                │ ┌──────────────────────────┐         │
                        node1   │ │       node2              │         │
                         │      │ │        │                 │         │
    ┌─────────────┐   ┌──┴──────┴─▲─┐   ┌──┴──────────┐   ┌──┴─────────▲┐
    │ _data: 1    │   │ _data: 2  │ │   │ _data: 3    │   │ _data: 4   ││
    │             │   │           │ │   │             │   │            ││
    │ _next: ─────►───┤ _next: ───┘ │   │ _next: self │   │ _next: None││
    │             ├───◄─────────┐   │   │             │   │            ││
    │ _prev: None │   │ _prev: ─┘   │   │ _prev: self │   │ _prev: ────┘│
    └─────────────┘   └─────────────┘   └─────────────┘   └─────────────┘
    

    ...which creates a second self-reference and makes node2 completely isolated.

    And then:

    node1._prev, node2._prev = node2._prev, node1._prev
    
                                ┌──────────────────────────────────────┐
                                │ ┌──────────────────────────┐         │
                        node1   │ │       node2              │         │
                         │      │ │        │                 │         │
    ┌─────────────┐   ┌──┴──────┴─▲─┐   ┌──┴──────────┐   ┌──┴─────────▲┐
    │ _data: 1    │   │ _data: 2  │ │   │ _data: 3    │   │ _data: 4   ││
    │             │   │           │ │   │             │   │            ││
    │ _next: ─────►───┤ _next: ───┘ │   │ _next: self │   │ _next: None││
    │             │   │             │   │             │   │            ││
    │ _prev: None │   │ _prev: ─────►───┤ _prev: ┐    │   │ _prev: ────┘│
    └──────────┬──┘   └─────────────┘   └────────▼────┘   └─────────────┘
               └─────────────────────────────────┘
    

    ...which finalises the _prev chain.

    And finally:

            node1._prev._next=node1
            node2._prev._next=node2
    

    Will also finalise the _next chain:

                                ┌──────────────────────────────────────┐
                                │ ┌──────────────────────────┐         │
                        node1   │ │       node2              │         │
                         │      │ │        │                 │         │
    ┌─────────────┐   ┌──┴──────┴─▲─┐   ┌──┴──────────┐   ┌──┴─────────▲┐
    │ _data: 1    │   │ _data: 2  │ │   │ _data: 3    │   │ _data: 4   ││
    │             │   │           │ ├───◄────────┐    │   │            ││
    │ _next: ─────►─┐ │ _next: ───┘ │   │ _next: ┘    │   │ _next: None││
    │             │ │ │             │   │             │   │            ││
    │ _prev: None │ │ │ _prev: ─────►───┤ _prev: ┐    │   │ _prev: ────┘│
    └──────────┬──┘ │ └─────────────┘   └──┬─────▼────┘   └─────────────┘
               │    └──────────────────────┘     │
               └─────────────────────────────────┘
    

    All is OK.

    Wrong code

    Here we start again from the state before the code went different ways:

                                  ┌──────────────────────────┐ 
                        node1     │       node2              │
                         │        │        │                 │
    ┌─────────────┐   ┌──┴────────▲─┐   ┌──┴──────────┐   ┌──┴──────────┐
    │ _data: 1    │   │ _data: 2  │ │   │ _data: 3    │   │ _data: 4    │
    │             │   │           │ │   │             │   │             │
    │ _next: ─────►───┤ _next: ───┘ │   │ _next: self │   │ _next: None │
    │             ├───◄─────────┐   ├───◄─────────┐   ├───◄─────────┐   │
    │ _prev: None │   │ _prev: ─┘   │   │ _prev: ─┘   │   │ _prev: ─┘   │
    └─────────────┘   └─────────────┘   └─────────────┘   └─────────────┘
    

    And then execute:

    node1._prev, node2._prev = node2._prev, node1._prev
    

    This results in:

                                  ┌──────────────────────────┐ 
                        node1     │       node2              │
                         │        │        │                 │
    ┌─────────────┐   ┌──┴────────▲─┐   ┌──┴──────────┐   ┌──┴──────────┐
    │ _data: 1    │   │ _data: 2  │ │   │ _data: 3    │   │ _data: 4    │
    │             │   │           │ │   │             │   │             │
    │ _next: ─────►───┤ _next: ───┘ │   │ _next: self │   │ _next: None │
    │             │   │             │   │             ├───◄─────────┐   │
    │ _prev: None │   │ _prev: self │   │ _prev: ┐    │   │ _prev: ─┘   │
    └──────────┬──┘   └─────────────┘   └────────▼────┘   └─────────────┘
               └─────────────────────────────────┘
    

    This time we created a self reference in node1, and this is a problem that the following statements will not resolve:

            node1._next._prev=node1
            node2._next._prev=node2
    
                                ┌──────────────────────────────────────┐
                                │ ┌──────────────────────────┐         │
                        node1   │ │       node2              │         │
                         │      │ │        │                 │         │
    ┌─────────────┐   ┌──┴──────┴─▲─┐   ┌──┴──────────┐   ┌──┴─────────▲┐
    │ _data: 1    │   │ _data: 2  │ │   │ _data: 3    │   │ _data: 4   ││
    │             │   │           │ │   │             │   │            ││
    │ _next: ─────►───┤ _next: ───┘ │   │ _next: self │   │ _next: None││
    │             │   │             │   │             │   │            ││
    │ _prev: None │   │ _prev: self │   │ _prev: self │   │ _prev: ────┘│
    └─────────────┘   └─────────────┘   └─────────────┘   └─────────────┘
    

    Note how the second assignment destroyed a _prev reference that was already correctly set, while node1._prev is not corrected. The wrong _prev attribute was updated! The code that remains to be executed does not affect any _prev attributes... so the damage is permanent.

    I hope this clarifies it.