Search code examples
pythoncollectionslinked-list

What collection to use that is able to track and remove an object from it in constant time in python?


I need a collection of elements the only purpose of which is to store elements for futher enumeration, so the elements don't have to be ordered/indexes in any way. But a simple list is not enough, because I may need to remove any element from the collection, and I need to do it in constant time. deque or other implementation of linked list as data structures would be sufficient for this purpose, because deletion of an element in linked list is a constant time operation. The problem is that python does not provide a way to track a specific node of a linked list (at least I don't know how), so if I need to remove previously added element from a list, first I need to find it, which is linear time operation.

In C++, there is a std::list with its std::list::iterator that does exactly what I need. The iterator holds the pointer to the node in the linked list. When I add an element to a linked list, I can store the iterator of the just added element and further use it to remove the element from the list in constant time, since I already have a pointer to the linked list node with the pointer to previous and next elements. It also doesn't matter whether and how many elements were added or deleted before or after the element.

So how can it be done in python? Is there some alternative to the C++'s std::list::iterator?

I also want to notice that I can implement a linked list in python myself, but I'm afraid it would be slower than the implementation written in C. And I also don't want to reinvent the wheel.


Solution

  • Two ideas, showcasing them as in your answer.

    Solution 1: collections.OrderedDict

    Your values are the dict values and increasing numbers are the dict keys.

    from collections import OrderedDict
    
    class Object:
        def __init__(self, i):
            self.i = i
            self.node = None
    
    objs = OrderedDict()
    objs_last = 0
    
    # Initialize the collection
    for i in range(10):
        obj = Object(i)
        
        # For example, store fifth element to further remove it
        if i == 5:
            obj5 = obj
    
        objs_last += 1
        objs[objs_last] = obj
    
        # Store the node in the collection containing the object. Used to further remove the object
        obj.node = objs_last
    
    print([obj.i for obj in objs.values()])
    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    
    # Delete the fifth element from the collection using stored reference to containing node
    del objs[obj5.node]
    
    print([obj.i for obj in objs.values()])
    # [0, 1, 2, 3, 4, 6, 7, 8, 9]
    

    Solution 2: list

    Store the element's list index. To delete an element, instead replace it with the last element in the list (and update that element's stored index).

    class Object:
        def __init__(self, i):
            self.i = i
            self.node = None
    
    objs = []
    
    # Initialize the collection
    for i in range(10):
        obj = Object(i)
        
        # For example, store fifth element to further remove it
        if i == 5:
            obj5 = obj
    
        objs.append(obj)
    
        # Store the node in the collection containing the object. Used to further remove the object
        obj.node = len(objs) - 1
    
    print([obj.i for obj in objs])
    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    
    # Delete the fifth element from the collection using stored reference to containing node
    objs[obj5.node] = objs[-1]
    objs.pop().node = obj5.node
    
    print([obj.i for obj in objs])
    # [0, 1, 2, 3, 4, 9, 6, 7, 8]
    

    You could do obj.node = len(objs) before the objs.append(obj), then you don't need the - 1. I just tried to stay as close as possible to your answer.