Tags: python, python-2.7, concurrency, ipc, python-multiprocessing

Why does "Broken pipe" error occur only while accessing a shared list in a specific scenario of multiprocessing?


Before I begin my question, let me mention that I already know that the following multiprocessing code is broken: it contains TOCTOU bugs. The code serves a pedagogical purpose for me, so that I can learn more about exactly how it is broken. My question is therefore about one specific aspect of the broken code. First, let me show my code.

For now, you can ignore worker_b completely because it is not used anywhere yet. We will come back to it later.

import Queue
import multiprocessing
import time

lock = multiprocessing.Lock()

def pprint(s):
    lock.acquire()
    print(s)
    lock.release()

def worker_a(i, stack):
    if stack:
        data = stack.pop()
        pprint('worker %d got %d' % (i, data))
        time.sleep(2)
        pprint('worker %d exiting ...' % i)
    else:
        pprint('worker %d has nothing to do!' % i)

def worker_b(i, stack):
    if stack:
        data = stack.pop()
        pprint('worker %d got %d (stack length: %d)' % (i, data, len(stack)))
        time.sleep(2)
        pprint('worker %d exiting ... (stack length: %d)' % (i, len(stack)))
    else:
        pprint('worker %d has nothing to do!' % i)

manager = multiprocessing.Manager()
stack = manager.list()

def master():
    for i in range(5):
        stack.append(i)
        pprint('master put %d' % i)

    i = 0
    while stack:
        t = multiprocessing.Process(target=worker_a, args=(i, stack))
        t.start()
        time.sleep(1)
        i += 1

    pprint('master returning ...')

master()

pprint('master returned!')

The above broken code appears to work fine.

$ python mplifo.py 
master put 0
master put 1
master put 2
master put 3
master put 4
worker 0 got 4
worker 1 got 3
worker 0 exiting ...
worker 2 got 2
worker 1 exiting ...
worker 3 got 1
worker 2 exiting ...
worker 4 got 0
worker 3 exiting ...
master returning ...
master returned!
worker 4 exiting ...

However, if I call worker_b instead of worker_a, i.e. change

        t = multiprocessing.Process(target=worker_a, args=(i, stack))

to

        t = multiprocessing.Process(target=worker_b, args=(i, stack))

the following error occurs.

$ python mplifo.py
master put 0
master put 1
master put 2
master put 3
master put 4
worker 0 got 4 (stack length: 4)
worker 1 got 3 (stack length: 3)
worker 0 exiting ... (stack length: 3)
worker 2 got 2 (stack length: 2)
worker 1 exiting ... (stack length: 2)
worker 3 got 1 (stack length: 1)
worker 2 exiting ... (stack length: 1)
worker 4 got 0 (stack length: 0)
worker 3 exiting ... (stack length: 0)
master returning ...
master returned!
Process Process-6:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "mplifo.py", line 27, in worker_b
    pprint('worker %d exiting ... (stack length: %d)' % (i, len(stack)))
  File "<string>", line 2, in __len__
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 758, in _callmethod
    conn.send((self._id, methodname, args, kwds))
IOError: [Errno 32] Broken pipe

  • Why does this error occur only in the case of worker_b?
  • Why does this error occur only for the second pprint() call in worker_b and not for the first pprint() call?

Solution

  • This part of the traceback gives you a hint:

      File "mplifo.py", line 27, in worker_b
        pprint('worker %d exiting ... (stack length: %d)' % (i, len(stack)))
      File "<string>", line 2, in __len__
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/managers.py", line 758, in _callmethod
        conn.send((self._id, methodname, args, kwds))
    

    In the worker process, stack is not an ordinary Python list. It is a proxy object created by multiprocessing.Manager(): the real list lives in the manager's server process, and every operation on the proxy (pop(), the truth test if stack, len()) is a request sent over a pipe to that process. The first pprint() call in worker_b works because the main program is still running when its len(stack) is evaluated. The second call comes two seconds later. By then the last worker has outlived master(): the main script has finished, and on exit the manager's server process is shut down. When the proxy then tries to send the len(stack) request, the other end of the pipe is gone, so conn.send() fails with IOError: [Errno 32] Broken pipe.
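
    A minimal sketch of the same failure, triggered deliberately by shutting the manager down before the proxy is used again (this is a separate illustrative script, not part of the question's code; the exact exception can vary with platform and Python version, but on the setup above it is the same IOError):

      import multiprocessing

      manager = multiprocessing.Manager()
      stack = manager.list([1, 2, 3])

      # stack is only a proxy (a ListProxy, not a plain list); the real
      # list lives in the manager's server process, and every operation
      # is a round trip over a pipe.
      print(type(stack))
      print(len(stack))    # 3 -- fetched from the manager process

      # Stop the manager's server process. This is what happens
      # implicitly when the main program in the question exits.
      manager.shutdown()

      # The proxy now tries to send its request down a dead pipe.
      print(len(stack))    # IOError: [Errno 32] Broken pipe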

    This doesn't happen in worker_a because its final pprint() call does not touch stack at all: nothing has to cross the pipe after the main program has exited, so there is no request left to fail.
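
    For completeness, a sketch of one way to keep the pipe alive in this pedagogical setup (reusing stack, pprint and worker_b from the script above; it does not address the TOCTOU issues): make master() wait for every worker before returning, so the main program, and with it the manager process, outlives the last len(stack) call.

      def master():
          for i in range(5):
              stack.append(i)
              pprint('master put %d' % i)

          workers = []
          i = 0
          while stack:
              t = multiprocessing.Process(target=worker_b, args=(i, stack))
              t.start()
              workers.append(t)
              time.sleep(1)
              i += 1

          # Join the workers while the manager is still running, so their
          # final len(stack) calls still have a live pipe to talk to.
          for t in workers:
              t.join()

          pprint('master returning ...')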