Before I begin my question, let me mention that I already know that the following multiprocessing code is broken. There are TOCTOU bugs in it. The code is meant to serve a pedagogical purpose, so that I can learn more about how exactly it is broken. So my question is about a specific aspect of the broken code. First, let me show my code.
For now, you can ignore worker_b completely because we don't use it anywhere right now. We will come back to it later.
import Queue
import multiprocessing
import time

lock = multiprocessing.Lock()

def pprint(s):
    lock.acquire()
    print(s)
    lock.release()

def worker_a(i, stack):
    if stack:
        data = stack.pop()
        pprint('worker %d got %d' % (i, data))
        time.sleep(2)
        pprint('worker %d exiting ...' % i)
    else:
        pprint('worker %d has nothing to do!' % i)

def worker_b(i, stack):
    if stack:
        data = stack.pop()
        pprint('worker %d got %d (stack length: %d)' % (i, data, len(stack)))
        time.sleep(2)
        pprint('worker %d exiting ... (stack length: %d)' % (i, len(stack)))
    else:
        pprint('worker %d has nothing to do!' % i)

manager = multiprocessing.Manager()
stack = manager.list()

def master():
    for i in range(5):
        stack.append(i)
        pprint('master put %d' % i)

    i = 0
    while stack:
        t = multiprocessing.Process(target=worker_a, args=(i, stack))
        t.start()
        time.sleep(1)
        i += 1

    pprint('master returning ...')

master()
pprint('master returned!')
The above broken code appears to work fine.
$ python mplifo.py
master put 0
master put 1
master put 2
master put 3
master put 4
worker 0 got 4
worker 1 got 3
worker 0 exiting ...
worker 2 got 2
worker 1 exiting ...
worker 3 got 1
worker 2 exiting ...
worker 4 got 0
worker 3 exiting ...
master returning ...
master returned!
worker 4 exiting ...
However, if I call worker_b instead of worker_a, i.e. change
t = multiprocessing.Process(target=worker_a, args=(i, stack))
to
t = multiprocessing.Process(target=worker_b, args=(i, stack))
the following error occurs.
$ python mplifo.py
master put 0
master put 1
master put 2
master put 3
master put 4
worker 0 got 4 (stack length: 4)
worker 1 got 3 (stack length: 3)
worker 0 exiting ... (stack length: 3)
worker 2 got 2 (stack length: 2)
worker 1 exiting ... (stack length: 2)
worker 3 got 1 (stack length: 1)
worker 2 exiting ... (stack length: 1)
worker 4 got 0 (stack length: 0)
worker 3 exiting ... (stack length: 0)
master returning ...
master returned!
Process Process-6:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "mplifo.py", line 27, in worker_b
    pprint('worker %d exiting ... (stack length: %d)' % (i, len(stack)))
  File "<string>", line 2, in __len__
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 758, in _callmethod
    conn.send((self._id, methodname, args, kwds))
IOError: [Errno 32] Broken pipe
Why does the error occur only with worker_b?
Why does it occur only for the second pprint() call in worker_b and not for the first pprint() call?

This part of the traceback gives you a hint:
  File "mplifo.py", line 27, in worker_b
    pprint('worker %d exiting ... (stack length: %d)' % (i, len(stack)))
  File "<string>", line 2, in __len__
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 758, in _callmethod
    conn.send((self._id, methodname, args, kwds))
In the worker process, stack is not a Python list. It’s a proxy created by multiprocessing.Manager, which wraps a list that lives in the manager's server process, a child of the master process. When the last worker_b exits, it evaluates len(stack), which the proxy has to request from that server process. But by that time the master has already finished and the manager has been shut down along with it, so the communication pipe is broken.
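The same failure can be sketched in a single process, without any workers. This is a minimal illustration rather than part of the original program: the function name len_after_shutdown is hypothetical, and the explicit manager.shutdown() call is an assumption standing in for what happens when the master exits. It targets Python 3, where the failure is typically an OSError subclass such as BrokenPipeError, or EOFError; the Python 2.7 traceback above reports it as IOError instead.

```python
import multiprocessing


def len_after_shutdown():
    """Call len() on a list proxy before and after its manager shuts down.

    Returns (length_before, error_type_after); error_type_after is None if
    the second call unexpectedly succeeds.
    """
    manager = multiprocessing.Manager()   # starts the manager's server process
    stack = manager.list([0, 1, 2])       # a proxy object, not a plain list
    length_before = len(stack)            # forwarded to the server: works
    manager.shutdown()                    # assumption: stands in for the master exiting
    try:
        len(stack)                        # same call that fails in worker_b
    except (EOFError, OSError) as exc:    # e.g. BrokenPipeError on Python 3
        return length_before, type(exc)
    return length_before, None


if __name__ == '__main__':
    print(len_after_shutdown())
```

The first len() call succeeds because the server is still running; the second one has nobody left to answer it.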
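The error doesn’t occur in worker_a because it does not attempt to evaluate len(stack) before exiting. Its last use of the proxy is stack.pop(), which runs while the master is still alive. For the same reason, the first pprint() call in worker_b succeeds: it runs right after the worker starts, before the master returns.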
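One way to check this explanation is to keep the master alive until every worker has finished. The sketch below is a simplified variant of the program, not the original code: the names worker and run_and_wait and the log list are hypothetical, it starts exactly one worker per item to sidestep the TOCTOU race, and it assumes Python 3. Because the master joins the workers before shutting the manager down, the late len(stack) call in each worker still has a server to talk to.

```python
import multiprocessing
import time


def worker(i, stack, log):
    """Pop one item, wait, then use the proxy again, as worker_b does."""
    data = stack.pop()                  # one worker per item, so never empty
    time.sleep(0.2)
    log.append((i, data, len(stack)))   # late proxy call, like the failing one


def run_and_wait(n=3):
    """Start n workers and join them all before the manager goes away."""
    manager = multiprocessing.Manager()
    stack = manager.list(list(range(n)))
    log = manager.list()
    workers = []
    for i in range(n):
        p = multiprocessing.Process(target=worker, args=(i, stack, log))
        p.start()
        workers.append(p)
    for p in workers:
        p.join()                        # master outlives every worker
    result = list(log)                  # copy out of the proxy before shutdown
    manager.shutdown()
    return result


if __name__ == '__main__':
    for entry in run_and_wait():
        print('worker %d got %d (stack length: %d)' % entry)
```

The order of the log entries and the reported stack lengths depend on scheduling, so they can differ between runs; what matters is that every worker completes its second proxy call.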