Search code examples
pythonlistcollectionssetdeque

How to keep ordered unique items in a size limited structure?


I need to save stream of elements in a size limited list. There may be duplicate elements in the stream but I need to just keep the unique ones. Also when the size of my list exceeds a specified limit, I need to remove the oldest element and add the new one.

I already have tried set and list. The problem with set is that it is not size limited and if I want to remove the oldest element I have no idea how to retrieve it because set is unordered; however, it solves the problem of uniqueness.

On the other hand list keeps the order of items, but I need to check for possible duplicates whenever I want to insert a new element and this can cost a lot of time. Also list is not size limited as well as set.

My third option could be collections.deque but I don't know if it keeps the order or not. And is there any way to keep the items in collections.deque unique?

These are examples of my codes for list:

ids = list()
for item in stream:
    if item not in ids:
        ids.append(item)
    if len(ids) >= len_limit:
        del ids[0]

and set:

ids = set()
for item in stream:
    ids.add(item)
    if len(ids) >= len_limit:
        ids.remove(list(ids)[0])

Solution

  • You can write your own class which keeps both deque ans set:

    import collections
    
    
    class Structure:
        def __init__(self, size):
            self.deque = collections.deque(maxlen=size)
            self.set = set()
    
        def append(self, value):
            if value not in self.set:
                if len(self.deque) == self.deque.maxlen:
                    discard = self.deque.popleft()
                    self.set.discard(discard)
                self.deque.append(value)
                self.set.add(value)
    
    s = Structure(2)
    s.append(1)
    s.append(2)
    s.append(3)
    s.append(3)
    print(s.deque)  # deque([2, 3], maxlen=2)