I'm practicing linked data structures in Python. One problem I'm working on is a method to swap two nodes within a doubly linked list without just moving data. So no node1._data=node2._data or anything like that. I have to mess around the the next and previous pointers. It is a method as part of a doubly linked list class. This method in particular takes in pointers to nodes as parameters. Not values that it has to search for in the list. It's meant to be used in a sort where swapping occurs.
I have two versions. The first works perfectly while the second has issues. The difference is the placement of exactly one line. They are otherwise identical. The working code is as follows:
def _swap(self, node1, node2):
assert l is not None and r is not None, "nodes to swap cannot be None"
#Handle cases where nodes are front or rear
if node1==self._front:
self._front=node2
elif node2==self._front:
self._front=node1
if node1==self._rear:
self._rear=node2
elif node2==self._rear:
self._rear=node1
#Swap pointers
node1._next, node2._next = node2._next, node1._next
if node1._next is not None:
node1._next._prev=node1
if node2._next is not None:
node2._next._prev=node2
node1._prev, node2._prev = node2._prev, node1._prev
if node1._prev is not None:
node1._prev._next=node1
if node2._prev is not None:
node2._prev._next=node2
return
The non-working code is almost exactly identical but fails if the two nodes in question are directly adjacent to each to each other. The difference is the swap pointers section. The swaps for the previous pointers are changed alongside the swaps for the next pointers. Like so
#Swap pointers
node1._next, node2._next = node2._next, node1._next
node1._prev, node2._prev = node2._prev, node1._prev
if node1._next is not None:
node1._next._prev=node1
if node2._next is not None:
node2._next._prev=node2
if node1._prev is not None:
node1._prev._next=node1
if node2._prev is not None:
node2._prev._next=node2
return
The line is exactly the same but placed a few lines up. Why does this affect swaps of adjacent nodes? Why does it work fine when dealing with non-adjacent nodes? This is a just hunch I'm just thinking of right now but does it have to do with the fact that when the nodes are adjacent and you swap the pointers you end up having things pointing to themselves?
When I first started out trying to tackle this problem I used the second incorrect implementation which caused a headache. After moving that one line around it works just fine but I don't know why it's such a big deal.
Let's visualise this for an example input with four nodes, in which we want to swap the two middle nodes:
node1 node2
│ │
┌─────────────┐ ┌──┴──────────┐ ┌──┴──────────┐ ┌─────────────┐
│ _data: 1 │ │ _data: 2 │ │ _data: 3 │ │ _data: 4 │
│ │ │ │ │ │ │ │
│ _next: ─────►───┤ _next: ─────►───┤ _next: ─────►───┤ _next: None │
│ ├───◄─────────┐ ├───◄─────────┐ ├───◄─────────┐ │
│ _prev: None │ │ _prev: ─┘ │ │ _prev: ─┘ │ │ _prev: ─┘ │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
We can ignore the code that deals with head/tail boundary cases, and then the first swap statement is this:
node1._next, node2._next = node2._next, node1._next
This results in:
┌──────────────────────────┐
node1 │ node2 │
│ │ │ │
┌─────────────┐ ┌──┴────────▲─┐ ┌──┴──────────┐ ┌──┴──────────┐
│ _data: 1 │ │ _data: 2 │ │ │ _data: 3 │ │ _data: 4 │
│ │ │ │ │ │ │ │ │
│ _next: ─────►───┤ _next: ───┘ │ │ _next: self │ │ _next: None │
│ ├───◄─────────┐ ├───◄─────────┐ ├───◄─────────┐ │
│ _prev: None │ │ _prev: ─┘ │ │ _prev: ─┘ │ │ _prev: ─┘ │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
...where notably node2._next === node2
(a self reference).
And now the two versions of the code go different ways.
The correct code continues with the following:
node1._next._prev=node1
node2._next._prev=node2
┌──────────────────────────────────────┐
│ ┌──────────────────────────┐ │
node1 │ │ node2 │ │
│ │ │ │ │ │
┌─────────────┐ ┌──┴──────┴─▲─┐ ┌──┴──────────┐ ┌──┴─────────▲┐
│ _data: 1 │ │ _data: 2 │ │ │ _data: 3 │ │ _data: 4 ││
│ │ │ │ │ │ │ │ ││
│ _next: ─────►───┤ _next: ───┘ │ │ _next: self │ │ _next: None││
│ ├───◄─────────┐ │ │ │ │ ││
│ _prev: None │ │ _prev: ─┘ │ │ _prev: self │ │ _prev: ────┘│
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
...which creates a second self-reference and makes node2
completely isolated.
And then:
node1._prev, node2._prev = node2._prev, node1._prev
┌──────────────────────────────────────┐
│ ┌──────────────────────────┐ │
node1 │ │ node2 │ │
│ │ │ │ │ │
┌─────────────┐ ┌──┴──────┴─▲─┐ ┌──┴──────────┐ ┌──┴─────────▲┐
│ _data: 1 │ │ _data: 2 │ │ │ _data: 3 │ │ _data: 4 ││
│ │ │ │ │ │ │ │ ││
│ _next: ─────►───┤ _next: ───┘ │ │ _next: self │ │ _next: None││
│ │ │ │ │ │ │ ││
│ _prev: None │ │ _prev: ─────►───┤ _prev: ┐ │ │ _prev: ────┘│
└──────────┬──┘ └─────────────┘ └────────▼────┘ └─────────────┘
└─────────────────────────────────┘
...which finalises the _prev
chain.
And finally:
node1._prev._next=node1
node2._prev._next=node2
Will also finalise the _next
chain:
┌──────────────────────────────────────┐
│ ┌──────────────────────────┐ │
node1 │ │ node2 │ │
│ │ │ │ │ │
┌─────────────┐ ┌──┴──────┴─▲─┐ ┌──┴──────────┐ ┌──┴─────────▲┐
│ _data: 1 │ │ _data: 2 │ │ │ _data: 3 │ │ _data: 4 ││
│ │ │ │ ├───◄────────┐ │ │ ││
│ _next: ─────►─┐ │ _next: ───┘ │ │ _next: ┘ │ │ _next: None││
│ │ │ │ │ │ │ │ ││
│ _prev: None │ │ │ _prev: ─────►───┤ _prev: ┐ │ │ _prev: ────┘│
└──────────┬──┘ │ └─────────────┘ └──┬─────▼────┘ └─────────────┘
│ └──────────────────────┘ │
└─────────────────────────────────┘
All is OK.
Here we start again from the state before the code went different ways:
┌──────────────────────────┐
node1 │ node2 │
│ │ │ │
┌─────────────┐ ┌──┴────────▲─┐ ┌──┴──────────┐ ┌──┴──────────┐
│ _data: 1 │ │ _data: 2 │ │ │ _data: 3 │ │ _data: 4 │
│ │ │ │ │ │ │ │ │
│ _next: ─────►───┤ _next: ───┘ │ │ _next: self │ │ _next: None │
│ ├───◄─────────┐ ├───◄─────────┐ ├───◄─────────┐ │
│ _prev: None │ │ _prev: ─┘ │ │ _prev: ─┘ │ │ _prev: ─┘ │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
And then execute:
node1._prev, node2._prev = node2._prev, node1._prev
This results in:
┌──────────────────────────┐
node1 │ node2 │
│ │ │ │
┌─────────────┐ ┌──┴────────▲─┐ ┌──┴──────────┐ ┌──┴──────────┐
│ _data: 1 │ │ _data: 2 │ │ │ _data: 3 │ │ _data: 4 │
│ │ │ │ │ │ │ │ │
│ _next: ─────►───┤ _next: ───┘ │ │ _next: self │ │ _next: None │
│ │ │ │ │ ├───◄─────────┐ │
│ _prev: None │ │ _prev: self │ │ _prev: ┐ │ │ _prev: ─┘ │
└──────────┬──┘ └─────────────┘ └────────▼────┘ └─────────────┘
└─────────────────────────────────┘
This time we created a self reference in node1
, and this is a problem that the following statements will not resolve:
node1._next._prev=node1
node2._next._prev=node2
┌──────────────────────────────────────┐
│ ┌──────────────────────────┐ │
node1 │ │ node2 │ │
│ │ │ │ │ │
┌─────────────┐ ┌──┴──────┴─▲─┐ ┌──┴──────────┐ ┌──┴─────────▲┐
│ _data: 1 │ │ _data: 2 │ │ │ _data: 3 │ │ _data: 4 ││
│ │ │ │ │ │ │ │ ││
│ _next: ─────►───┤ _next: ───┘ │ │ _next: self │ │ _next: None││
│ │ │ │ │ │ │ ││
│ _prev: None │ │ _prev: self │ │ _prev: self │ │ _prev: ────┘│
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
Note how the second assignment destroyed a _prev
reference that was already correctly set, while node1._prev
is not corrected. The wrong _prev
attribute was updated! The code that remains to be executed does not affect any _prev
attributes... so the damage is permanent.
I hope this clarifies it.