python generator

Avoid duplicates with the yield keyword?


Is it possible to avoid duplicates in a yield generator?

For example:

def foo():
    for i in (1,1,1,2,3,4,5):
        # check whether i has already been yielded?
        yield i

gen = foo()
numbers = list(gen)
print(numbers)

>>>[1,2,3,4,5]

numbers = list(gen) is cheating; the goal is to do it within the function.


Solution

  • You can define foo as follows:

    def foo():
        seen = set()
        for i in (1, 1, 1, 2, 3, 4, 5):
            if i not in seen:
                seen.add(i)
                yield i
    

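    This version yields each value in the order it first appears; the set is only used for membership checks. Keep in mind that sets themselves are unordered. If order doesn't matter to you, then you can use: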

    def foo(lst):
        return (v for v in set(lst))
    

    as pointed out by @TomKarzes in the comments. If you need the result to be in order then you'll have to stick to the initial formulation.
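
As a rough sketch, the first formulation can be generalized to accept any iterable instead of a hard-coded tuple. The name `unique` is hypothetical (it is not from the original answer), and the approach assumes the items are hashable, since they are stored in a set.

```python
def unique(iterable):
    """Yield each item the first time it appears, preserving input order."""
    seen = set()  # items already yielded; requires hashable items
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item


# The generator is lazy: items are deduplicated as they are consumed.
print(list(unique((1, 1, 1, 2, 3, 4, 5))))  # [1, 2, 3, 4, 5]

# Eager alternative: dicts keep insertion order in Python 3.7+,
# so dict.fromkeys also deduplicates while keeping first-seen order.
print(list(dict.fromkeys((3, 1, 3, 2, 1))))  # [3, 1, 2]
```

The generator version is the better fit when the input is large or unbounded, because it never materializes the whole sequence; `dict.fromkeys` is shorter but builds the full result up front.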