
Why doesn't yield return individual values during each loop constructing this defaultdict?


This is the original code:

from collections import defaultdict

lis = [[1, 2], [2, 1], [3, 0], [2, 1], [1, 1]]

res = defaultdict(int)
for i, j in lis:
    res[i] += j
    print(res.items())

result

dict_items([(1, 2)])
dict_items([(1, 2), (2, 1)])
dict_items([(1, 2), (2, 1), (3, 0)])
dict_items([(1, 2), (2, 2), (3, 0)])
dict_items([(1, 3), (2, 2), (3, 0)])

I want to use yield to get these printed items.

from collections import defaultdict

li = [[1, 2], [2, 1], [3, 0], [2, 1], [1, 1]]


def g(lis: list):
    res = defaultdict(int)
    for i, j in lis:
        res[i] += j
        yield res.items()


print(*g(li))

but I get

dict_items([(1, 3), (2, 2), (3, 0)]) dict_items([(1, 3), (2, 2), (3, 0)]) dict_items([(1, 3), (2, 2), (3, 0)]) dict_items([(1, 3), (2, 2), (3, 0)]) dict_items([(1, 3), (2, 2), (3, 0)])

Solution

  • What you say in your own answer is true. Note, though, that the behavior you uncovered applies equally to your first code example: if you collected each res.items() value into a list and then printed that list with a single print statement, you would see the same repeated output. res.items() returns a live view of the dictionary, not a copy, so every collected value reflects the dictionary's final state once the loop has finished. yield has nothing to do with the issue you're seeing; print(*g(li)) simply exhausts the generator before anything is printed. I expect you already know this, but I am pointing it out in case a later reader thinks yield introduces the problem. It does not.

    To see this, change your second example to print each yielded value immediately. Both examples then do the same thing, printing each value as soon as it is generated, and both versions of your code produce the same result.

    Here's a full set of code to demonstrate this:

    from collections import defaultdict
    
    lis = [[1, 2], [2, 1], [3, 0], [2, 1], [1, 1]]
    
    res = defaultdict(int)
    for i, j in lis:
        res[i] += j
        print(res.items())
    
    def g(lis: list):
        res = defaultdict(int)
        for i, j in lis:
            res[i] += j
            yield res.items()
    
    for v in g(lis):
        # Print the next generated value
        print(v)
    

    Result:

    dict_items([(1, 2)])
    dict_items([(1, 2), (2, 1)])
    dict_items([(1, 2), (2, 1), (3, 0)])
    dict_items([(1, 2), (2, 2), (3, 0)])
    dict_items([(1, 3), (2, 2), (3, 0)])
    dict_items([(1, 2)])
    dict_items([(1, 2), (2, 1)])
    dict_items([(1, 2), (2, 1), (3, 0)])
    dict_items([(1, 2), (2, 2), (3, 0)])
    dict_items([(1, 3), (2, 2), (3, 0)])
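
If the goal is to keep every intermediate state rather than print it right away, one option is to yield a copy of the items instead of the live view. The sketch below is only an illustration: g_snapshot, views, and snapshots are hypothetical names that do not appear in the question, and list(res.items()) is just one of several ways to take a copy.

```python
from collections import defaultdict

li = [[1, 2], [2, 1], [3, 0], [2, 1], [1, 1]]


def g(lis):
    res = defaultdict(int)
    for i, j in lis:
        res[i] += j
        # Live view: every yielded object looks at the same dict.
        yield res.items()


def g_snapshot(lis):  # hypothetical name, not from the question
    res = defaultdict(int)
    for i, j in lis:
        res[i] += j
        # Copy: freezes the items as they are at this step.
        yield list(res.items())


# Exhaust both generators before printing anything,
# which is what print(*g(li)) does as well.
views = list(g(li))
snapshots = list(g_snapshot(li))

print(*views)                # five views, all showing the final dict
print(*snapshots, sep="\n")  # five lists, one per intermediate state
```

With the copies, collecting first and printing later shows the five intermediate states, while the collected views all show the final contents of the dictionary.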