I need to convert a `threading` application to a `multiprocessing` application for multiple reasons (GIL, memory leaks). Fortunately the threads are quite isolated and only communicate via `Queue.Queue`s. This primitive is also available in `multiprocessing`, so everything looks fine. Now, before I enter this minefield, I'd like to get some advice on the upcoming problems:
1. How do I ensure that my objects can be transferred via the `Queue`? Do I need to provide some `__setstate__`?
2. Can I rely on `put` returning instantly (like with `threading` `Queue`s)?

Answer to part 1:
Everything that has to pass through a `multiprocessing.Queue` (or a `Pipe`, or whatever) has to be picklable. This includes basic types such as `tuple`s, `list`s and `dict`s. Classes are also supported if they are top-level and not too complicated (check the `pickle` documentation for the details). Trying to pass `lambda`s around will fail, however.
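A quick way to find out whether an object will survive the trip is to try pickling it yourself. This is only a minimal sketch: the `Job` class and the `is_picklable` helper are hypothetical names made up for illustration, not part of `multiprocessing`.

```python
import pickle


class Job:
    """Hypothetical payload class, defined at module top level."""

    def __init__(self, name, payload):
        self.name = name
        self.payload = payload


def is_picklable(obj):
    """Return True if pickle can serialize obj (illustrative helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print(is_picklable((1, [2, 3], {"key": "value"})))  # basic containers
    print(is_picklable(Job("resize", b"\x00\x01")))     # top-level class instance
    print(is_picklable(lambda x: x + 1))                # lambdas are rejected
```

Run as a script, the first two checks should succeed and the lambda should be rejected. Running such a check on your real message types before porting is cheaper than debugging a failure inside the queue's background thread.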
Answer to part 2:
A `put` consists of two parts: it takes a semaphore to modify the queue, and it optionally starts a feeder thread. So if no other `Process` tries to `put` to the same `Queue` at the same time (for instance because there is only one `Process` writing to it), it should be fast. For me it turned out to be fast enough for all practical purposes.
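To see this behaviour for yourself, something like the following sketch may help. It assumes a single writer and uses a hypothetical `demo_put` function. It also shows the one case I'm aware of where `put` does wait: a queue created with a `maxsize` that is already full, which is the same as with a `threading` queue.

```python
import multiprocessing as mp
import queue  # only needed for the queue.Full exception


def demo_put():
    # Unbounded queue: put() only appends to an internal buffer and lets a
    # background feeder thread do the actual pickling and writing.
    unbounded = mp.Queue()
    for n in range(1000):
        unbounded.put(n)
    received = [unbounded.get(timeout=5) for _ in range(1000)]

    # Bounded queue: once maxsize items are pending, put() has to wait.
    bounded = mp.Queue(maxsize=1)
    bounded.put("first")
    try:
        bounded.put("second", timeout=0.1)  # no free slot, so this times out
        blocked = False
    except queue.Full:
        blocked = True
    bounded.get(timeout=5)  # drain so the feeder thread can finish cleanly
    return received, blocked


if __name__ == "__main__":
    items, blocked = demo_put()
    print(len(items), blocked)
```

With only one writer the items come back in the order they were put, so the first result is simply `list(range(1000))`.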
Partial answer to part 3:
- `multiprocessing.Queue` lacks a `task_done` method, so it cannot be used as a drop-in replacement directly. (The `multiprocessing.JoinableQueue` subclass provides the method.)
- The older `processing.queue.Queue` lacks a `qsize` method, and the newer `multiprocessing` version is only approximate (just keep this in mind).
- Since file descriptors are inherited on `fork`, care needs to be taken about closing them in the right processes.
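For code that relies on `task_done` and `join`, a port might look roughly like the sketch below. It assumes a single consumer process and a `None` sentinel to stop it; `worker` and `run_demo` are hypothetical names, and the doubling is just stand-in work.

```python
import multiprocessing as mp


def worker(tasks, results):
    """Consume items until a None sentinel arrives, acknowledging each one."""
    while True:
        item = tasks.get()
        try:
            if item is None:
                break
            results.put(item * 2)  # stand-in for the real work
        finally:
            tasks.task_done()  # also acknowledges the sentinel


def run_demo():
    tasks = mp.JoinableQueue()  # has task_done() and join()
    results = mp.Queue()        # plain queue, no task_done()
    proc = mp.Process(target=worker, args=(tasks, results))
    proc.start()
    for n in range(5):
        tasks.put(n)
    tasks.put(None)
    tasks.join()  # returns once every put() was matched by a task_done()
    doubled = sorted(results.get() for _ in range(5))
    proc.join()   # join the process only after its results were read
    return doubled


if __name__ == "__main__":
    print(run_demo())
```

Reading the results before calling `proc.join()` is deliberate: a child that still has unflushed items in a queue may not exit, so joining it first can deadlock.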