In today's world of multicore, multithreaded CPUs (the one in my notebook has two cores with two threads per core) it makes more and more sense to write code that can take advantage of the available hardware. Languages like Go were born to make it easier for a programmer to speed up an application by spawning multiple 'independent' processes and synchronizing them again later on.
With this in mind, when I first encountered generator functions in Python I expected that such a function would use the idle time between subsequent item requests to prepare the next yield for immediate delivery, but that does not seem to be the case - at least that is my interpretation of the results I got from running the code provided below.
What confused me even more is that the caller of the generator function must wait until the function has processed all of its remaining instructions, even if the generator has already delivered all of its items.
Is there a clear reason I am not seeing why a generator function doesn't use the idle time between yield requests to run the code past the requested yield until it reaches the next yield statement, and why it even makes the caller wait after all of the items have already been delivered?
Here is the code I have used:
import time
startTime = time.time()
time.sleep(1)
def generatorFunctionF():
    print("# here: generatorFunctionF() lineNo #1", time.time()-startTime)
    for i in range(1,4):
        print("# now: time.sleep(1)", time.time()-startTime)
        time.sleep(1)
        print("# before yield", i, time.time()-startTime)
        yield i
        print("# after yield", i, time.time()-startTime)
    print("# now: time.sleep(5)", time.time()-startTime)
    time.sleep(5)
    print("# end followed by 'return'", time.time()-startTime)
    return
#:def
def standardFunctionF():
    print("*** before: 'gFF = generatorFunctionF()'", time.time()-startTime)
    gFF = generatorFunctionF()
    print("*** after: 'gFF = generatorFunctionF()'", time.time()-startTime)
    print("*** before print(next(gFF)", time.time()-startTime)
    print(next(gFF))
    print("*** after print(next(gFF)", time.time()-startTime)
    print("*** before time.sleep(3)", time.time()-startTime)
    time.sleep(3)
    print("*** after time.sleep(3)", time.time()-startTime)
    print("*** before print(next(gFF)", time.time()-startTime)
    print(next(gFF))
    print("*** after print(next(gFF)", time.time()-startTime)
    print("*** before list(gFF)", time.time()-startTime)
    print("*** list(gFF): ", list(gFF), time.time()-startTime)
    print("*** after: list(gFF)", time.time()-startTime)
    print("*** before time.sleep(3)", time.time()-startTime)
    time.sleep(3)
    print("*** after time.sleep(3)", time.time()-startTime)
    return "*** endOf standardFunctionF"
print()
print(standardFunctionF)
print(standardFunctionF())
which gives:
>python3.6 -u "aboutIteratorsAndGenerators.py"
<function standardFunctionF at 0x7f97800361e0>
*** before: 'gFF = generatorFunctionF()' 1.001169204711914
*** after: 'gFF = generatorFunctionF()' 1.0011975765228271
*** before print(next(gFF) 1.0012099742889404
# here: generatorFunctionF() lineNo #1 1.0012233257293701
# now: time.sleep(1) 1.0012412071228027
# before yield 1 2.0023491382598877
1
*** after print(next(gFF) 2.002397298812866
*** before time.sleep(3) 2.0024073123931885
*** after time.sleep(3) 5.005511283874512
*** before print(next(gFF) 5.005547761917114
# after yield 1 5.005556106567383
# now: time.sleep(1) 5.005565881729126
# before yield 2 6.006666898727417
2
*** after print(next(gFF) 6.006711006164551
*** before list(gFF) 6.0067174434661865
# after yield 2 6.006726026535034
# now: time.sleep(1) 6.006732702255249
# before yield 3 7.0077736377716064
# after yield 3 7.0078125
# now: time.sleep(5) 7.007838010787964
# end followed by 'return' 12.011908054351807
*** list(gFF): [3] 12.011950254440308
*** after: list(gFF) 12.011966466903687
*** before time.sleep(3) 12.011971473693848
*** after time.sleep(3) 15.015069007873535
*** endOf standardFunctionF
>Exit code: 0
Generators were designed as a simpler, shorter, easier-to-understand syntax for writing iterators. That was their use case. People who want to make iterators shorter and easier to understand do not want to introduce the headaches of thread synchronization into every iterator they write. That would be the opposite of the design goal.
As such, generators are based around the concept of coroutines and cooperative multitasking, not threads. The design tradeoffs are different; generators sacrifice parallel execution in exchange for semantics that are much easier to reason about.
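To illustrate the cooperative part (a minimal sketch of my own, not code from the question): the generator body runs only when the consumer explicitly hands control to it with next(), and it stops dead at each yield until asked again.
def ticker():
    # Nothing in this body runs until the consumer requests a value.
    print("computing first value")
    yield 1
    print("computing second value")  # runs only on the *next* request
    yield 2

g = ticker()     # no output yet: the body has not even started
first = next(g)  # prints "computing first value", returns 1
second = next(g) # prints "computing second value", returns 2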
Also, using separate threads for every generator would be really inefficient, and figuring out when to parallelize is a hard problem. Most generators aren't actually worth executing in another thread. Heck, they wouldn't be worth executing in another thread even in GIL-less implementations of Python, like Jython or Grumpy.
If you want something that runs in parallel, that's already handled by starting a thread or process and communicating with it through queues.
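If the read-ahead behaviour from the question is what you are after, a rough sketch of that thread-plus-queue approach could look like the following (the prefetch wrapper, its buffer_size parameter, and the sentinel object are my own choices for illustration, not anything built into Python's generators):
import threading
import queue

_SENTINEL = object()  # marks the end of the wrapped generator

def prefetch(gen, buffer_size=1):
    """Run `gen` in a background thread and yield its items from a queue,
    so the next item can be produced while the consumer is still busy."""
    q = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in gen:
            q.put(item)       # blocks when the buffer is full
        q.put(_SENTINEL)      # signal that the generator is exhausted

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = q.get()
        if item is _SENTINEL:
            return
        yield item

# Usage with the generator from the question:
# for value in prefetch(generatorFunctionF()):
#     ...  # each value may already be waiting in the queue
Note that this only helps when the producer does real waiting or releases the GIL (e.g. sleeping or doing I/O); for pure Python computation the GIL still prevents the two threads from running simultaneously.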