Search code examples
algorithmdata-structureslinked-listnodes

Why isn't deletion O(1) in both Singly Linked Lists and Doubly Linked Lists, when given the node to delete?


It's abundantly clear to me that when we want to delete a node in a Linked List (be it doubly or singly linked), and we have to search for this node, the time complexity for this task is O(n), as we must traverse the whole list in the worst case to identify the node. Similarly, it is O(k) if we want to delete the k-th node, and we don't have a reference to this node already.

It is commonly cited that one of the benefits of using a doubly linked list over a singly linked list is that deletion is O(1) when we have a reference to the node we want to delete. I.e., if you want to delete the Node i, simply do the following: i.prev.next = i.next and i.next.prev = i.prev

It is said that deletion is O(1) in a singly linked list ONLY if you have a reference to the node prior to the one you want to delete. However, I don't think this is necessarily the case. If you want to delete Node i (and you have a reference to Node i), why can't you just copy over the data from i.next, and set i.next = i.next.next? This would also be O(1), as in the doubly linked list case, meaning that deletion is no more efficient in a doubly linked list in ANY case, as far as Big-O is concerned. Of course, this idea wouldn't work if the node you're trying to delete is the last node in the linked list.

It's really bugging me that no one remembers this when comparing singly and doubly linked lists. What am I missing?

To clarify: what I'm suggesting in the singly linked case is overwriting the data at the Node you want to delete, with the data from the next node, and then deleting the next node. This has the same desired effect as deleting Node i, though it is not what you're doing per se.

EDIT

What I've Learned:

So it seems that I am correct to some extent. First of all, many people mentioned that my solution isn't complete, as deletion of the last element is a problem, so my algorithm is O(n) (by definition of Big-O). I naively suggested in response to get around this by keeping track of the "2nd to last node" in your list - of course, this causes problems once the last node in your list has been deleted the first time. A solution that was suggested, and does seem to work, is to demarcate the end of your list with something like a NullNode, and I like this approach.

Other problems that were presented were referential integrity, and the time associated with copying the data itself from the next node (i.e. presumably, a costly deep copy might be necessary). If you can assume that you don't have other objects using the node that you're copying, and that the task of copying is O(1) in itself, then it seems like my solution works. Although, at this point, maybe its worth it to just use a Doubly Linked List :)


Solution

  • It is true that copying data from i.next to i and then deleting i would be O(1) assuming that copying the data is also O(1).

    But even with this algorithm, since deleting the last element is O(n), and a description of a function in terms of big O notation only provides an upper bound on the growth rate of the function, that means your algorithm is still O(n).

    Regarding your comment:

    I guess my dissatisfaction comes from the fact that textbooks and basically every resource online cites the #1 biggest advantage of doubly linked lists is deletion - this seems a little disingenuous. It's a very specific case of deletion - deletion at the tail! If efficient deletion is all you're going for, seems like this doesn't warrant using a doubly instead of a singly linked list (due to all the overhead necessary of having double the number of pointers). Simply store a reference to the second to last node in your list, and you're good to go!

    You can certainly store a reference to the second to last node and make deletion of the last node O(1), but this would be the case only for the first time you delete the last node. You could update the reference to the node before it, but finding it will be O(n). You can solve this if you keep a reference to second to last element, and so on. At this point, you have reasoned your way to a doubly linked list, whose main advantage is deletion, and since you already have pointers to previous nodes you don't really need to move values around.

    Remember that big O notation talks about the worst case scenario, so if even a single case is O(n) then your entire algorithm is O(n).

    When you say a solution is O(n) you are basically saying "in the worst possible case, this algorithm will grow as fast as n grows".

    Big O does not talk about expected or average performance, and it's a great theoretical tool, but you need to consider your particular use cases when deciding what to use.

    Additionally, if you need to preserve reference integrity, you would not want to move values from one node to the other, i.e. if you tad a reference to node i+1 and delete node i, you wouldn't expect your reference to be silently invalid, so when removing elements the more robust option is to delete the node itself.