I am trying to understand the behaviour of the yield statement by building a generator that behaves like the built-in 'enumerate' function, but I am seeing inconsistencies depending on how I iterate through it.
def enumerate(sequence, start=0):
    n = start
    for elem in sequence:
        print("Before the 'yield' statement in the generator, n = {}".format(n))
        yield n, elem
        n += 1
        print("After the 'yield' statement in the generator, n = {}".format(n))
My understanding of generators is that the execution of the code will stop once a yield statement has been reached, upon which it returns a value. This matches what I get with the script below.
a = 'foo'
b = enumerate(a)
n1,v1 = next(b)
print('n1 = {}, v1 = {}\n'.format(n1,v1))
n2,v2 = next(b)
print('n2 = {}, v2 = {}'.format(n2,v2))
In this case, the generator seems to stop exactly at the yield statement and resume at the n += 1 line on the second call to next():
Before the 'yield' statement in the generator, n = 0
n1 = 0, v1 = f
After the 'yield' statement in the generator, n = 1
Before the 'yield' statement in the generator, n = 1
n2 = 1, v2 = o
However, if I use the for loop below, the generator does not seem to stop at the yield statement.
for n,v in enumerate(a[0:1]):
    print('n = {}, v = {}'.format(n,v))
This is what I get:
Before the 'yield' statement in the generator, n = 0
n = 0, v = f
After the 'yield' statement in the generator, n = 1
Edit taking comments into account
I realise I'm iterating over just one element, but I was not expecting to see the very last "After the 'yield' statement in the generator" sentence (which appears even if I iterate over ALL the elements).
print('\n\n')
for n,v in enumerate(a):
    print('n = {}, v = {}'.format(n,v))
Before the 'yield' statement in the generator, n = 0
n = 0, v = f
After the 'yield' statement in the generator, n = 1
Before the 'yield' statement in the generator, n = 1
n = 1, v = o
After the 'yield' statement in the generator, n = 2
Before the 'yield' statement in the generator, n = 2
n = 2, v = o
After the 'yield' statement in the generator, n = 3
Why does this happen?
The fundamental issue here is that you are confusing two things: you can tell when the generator will be exhausted just by reading it, but Python can only find out by running the code. When Python reaches the yield that you consider to be the last one, it does not actually know that it is the last one. What if your generator looked like this:
def enumeratex(x, start=0):
    for elem in x:
        yield start, elem
        start += 1
    yield start, None
Here, for reasons no one will ever know, a final None element is yielded after the main generator loop. Python would have no way of knowing that the generator is done until the function body either returns (explicitly or by running off the end) or raises an exception.
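To make that concrete, here is a small sketch that drives a copy of the generator above by hand. The name enumeratex_demo is made up for this illustration, and the copy is assumed to yield each element. The extra (2, None) pair only shows up because Python kept running the function body after the loop's last yield, and the generator is reported as finished only on the call after that.

```python
def enumeratex_demo(x, start=0):
    # Hypothetical copy of the generator above, yielding each element.
    for elem in x:
        yield start, elem
        start += 1
    # One more value, produced after the loop has finished.
    yield start, None


gen = enumeratex_demo("ab")
first = next(gen)   # value from the first loop iteration
second = next(gen)  # value from the second loop iteration
third = next(gen)   # value from the yield that follows the loop

try:
    next(gen)
    exhausted = False
except StopIteration:
    # Only now, once the function body has returned, is the generator done.
    exhausted = True

print(first, second, third, exhausted)
```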
In versions before Python 3.7, generators could raise StopIteration to indicate termination. In fact, a return statement would be equivalent to either raise StopIteration (if returning None) or raise StopIteration(return_value).
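The link between return and StopIteration can still be observed from the calling side. The sketch below uses a made-up generator called countdown (not part of the question) and assumes Python 3.3 or later, where a generator's return value is carried on the exception's value attribute.

```python
def countdown(n):
    # Hypothetical generator used only for this illustration.
    while n > 0:
        yield n
        n -= 1
    return "liftoff"  # ends up on StopIteration.value


gen = countdown(2)
values = [next(gen), next(gen)]

try:
    next(gen)
    result = None
except StopIteration as stop:
    # The caller sees the generator's return value here.
    result = stop.value

print(values, result)
```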
So while the exact manner in which you tell Python to end the generator is up to you, you do have to be explicit about it. A yield does not by itself end the generator.
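This is also why the for loop in the question prints the final "After" line. A for loop is roughly equivalent to calling next() until StopIteration is raised, so it always makes one more call after the last value. The sketch below spells that out by hand; numbered is a hypothetical stand-in for the question's generator that records events in a list instead of printing them.

```python
events = []


def numbered(sequence, start=0):
    # Hypothetical stand-in for the question's generator; it logs to a list.
    n = start
    for elem in sequence:
        events.append(f"before yield, n = {n}")
        yield n, elem
        n += 1
        events.append(f"after yield, n = {n}")


# Roughly what `for n, v in numbered("f"): ...` does behind the scenes.
iterator = iter(numbered("f"))
while True:
    try:
        n, v = next(iterator)
    except StopIteration:
        # Raised by the extra next() call, after the post-yield code has run.
        events.append("StopIteration: loop ends")
        break
    events.append(f"loop body, n = {n}, v = {v}")

print("\n".join(events))
```

With only two explicit next() calls, as in the first script of the question, that extra call is never made, so the code after the second yield never runs.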
TL;DR
When a for loop drives a generator, the code after the last yield still runs: the loop keeps asking for the next value, and Python can only find out that there is none by executing the rest of the function until it returns.