1 from multiprocessing import Pool, Manager
2
3
4 def test(num):
5 queue.put(num)
6
7
8 queue = Manager().Queue()
9 pool = Pool(5)
10
11 for i in range(30):
12 pool.apply_async(test, (i, ))
13
14 pool.close()
15 pool.join()
16
17 print(queue.qsize())
The output of the code above is 30. However, if line 8 is exchanged with line 9 (see the code below), the output will be 0. So is there anybody who knows why? Thank you!
1 from multiprocessing import Pool, Manager
2
3
4 def test(num):
5 queue.put(num)
6
7
8 pool = Pool(5)
9 queue = Manager().Queue()
10
11 for i in range(30):
12 pool.apply_async(test, (i, ))
13
14 pool.close()
15 pool.join()
16
17 print(queue.qsize())
from multiprocessing import Process, Queue
def test():
queue.put(1)
p = Process(target=test)
queue = Queue()
p.start()
p.join()
print(queue.qsize())
The output is 1, which means the child process put the number into the queue created by the parent. Is that correct?
I assume you are using a Unix based operating system as on NT ones your logic would most likely break.
To understand what happens we need to dig into the multiprocessing
internals. On Unix, when creating a new process, the fork
primitive is used. When a process forks, the parent continues its execution and the child starts as an exact copy of the parent.
Python tends to hide a lot of things in the multiprocessing
module (I particularly dislike that) and leads to lots of misunderstandings. In your logic, the fork
happens when you create the Pool
(line 9 in the first example, 8 in the second).
In the first example, the children inherits the same queue
object the parent created. Therefore, they successfully manage to communicate as they share the same channel.
In the second one instead, parent and children create their own separated queue
objects which are totally independent. When a child puts an element in the queue
it puts it in its own one which is not shared by anybody.
In the third and last example you create a Process
object, then a Queue
and then you call start
on the process. Guess when the fork
happens? When you call start
and not when you create the Process
object. That's why the queue
is successfully shared. This is what I mean when I say that multiprocessing
APIs are a bit misleading.