I want to run a function in Python in a new process, have it do some work and report progress to the main process through a queue, wait in the main process until the spawned process terminates, and then continue execution of the main process.
I have the following code, which runs the function foo in a new process and returns progress using a queue:
import multiprocessing as mp
import time

def foo(queue):
    for i in range(10):
        queue.put(i)
        time.sleep(1)

if __name__ == '__main__':
    mp.set_start_method('spawn')
    queue = mp.Queue()
    p = mp.Process(target=foo, args=(queue,))
    p.start()
    while p.is_alive():
        print("ALIVE")
        print(queue.get())
        time.sleep(0.01)
    print("Process finished")
The output is:
ALIVE
0
ALIVE
1
ALIVE
2
ALIVE
3
ALIVE
4
ALIVE
5
ALIVE
6
ALIVE
7
ALIVE
8
ALIVE
9
ALIVE
At some point neither "ALIVE" nor "Process finished" is printed. How can I continue execution when the spawned process stops running?
*Edit
The problem was that I didn't know that queue.get() blocks until an item is put into the queue if the queue is empty. I fixed it by changing
while p.is_alive():
    print(queue.get())
    time.sleep(0.01)

to

while p.is_alive():
    if not queue.empty():
        print(queue.get())
    time.sleep(0.01)
Your code has a race condition. After the last number is put into the queue, the child process sleeps one more time before it exits. That gives the parent process enough time to fetch that item, sleep for its much shorter interval, see that the child is still alive, and then block in queue.get() waiting for an 11th item that never comes.
Note that you get more ALIVE reports in your output than you do numbers. That tells you where the parent process is stuck: inside the queue.get() call that follows the last ALIVE.
There are a few possible ways you could fix the issue. You could change the foo function to sleep first and put the item into the queue afterwards. It would then exit immediately after sending the 9 to its parent, which would probably avoid the race condition (since the parent does sleep for a short time after receiving each item). There would still be a small possibility of the race happening if the child were unusually slow to exit, but it's quite unlikely.
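A minimal sketch of that reordering, assuming the parent loop stays exactly as in the question. The delay parameter is an addition for illustration (the question hard-codes one second). This only narrows the window for the race; it does not close it.

```python
import time


def foo(queue, delay=1.0):
    """Report progress 0..9, sleeping *before* each put instead of after.

    `delay` is a hypothetical parameter added for this sketch.
    """
    for i in range(10):
        time.sleep(delay)  # do the "work" first ...
        queue.put(i)       # ... then report progress
    # No trailing sleep: the function returns right after putting 9.
```

The child still needs a moment to shut down after foo returns, so the parent's is_alive() check can in principle still see it running.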
A better approach might be to prevent the race from occurring at all. For example, you might pass a timeout to the queue.get call, so that it gives up (with a queue.Empty exception) if there's nothing to retrieve for too long. You could catch that exception immediately, or catch it at a higher level and use it as the planned way of breaking out of the loop, rather than testing whether the child is still alive.
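The timeout idea could look roughly like the sketch below. The helper name drain, the delay parameter, and the two-second timeout are assumptions made for illustration; the queue variable is renamed to q so it does not shadow the standard queue module, which is where the Empty exception lives. The timeout has to be longer than both the child's startup time and the longest gap between items.

```python
import multiprocessing as mp
import queue  # standard-library module; provides the queue.Empty exception
import time


def foo(q, delay=0.1):
    # Same worker as in the question; the delay is shortened for this sketch.
    for i in range(10):
        q.put(i)
        time.sleep(delay)


def drain(q, timeout):
    """Yield items from q until none arrives within `timeout` seconds."""
    while True:
        try:
            yield q.get(timeout=timeout)
        except queue.Empty:
            return  # nothing arrived in time: treat the producer as done


if __name__ == '__main__':
    mp.set_start_method('spawn')
    q = mp.Queue()
    p = mp.Process(target=foo, args=(q,))
    p.start()
    for item in drain(q, timeout=2.0):
        print(item)
    p.join()  # wait for the child to exit before continuing
    print("Process finished")
```

Here the queue.Empty exception, not p.is_alive(), ends the loop, and p.join() then waits for the child before the main process continues.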
A final option might be to send a special sentinel value from the child to the parent in the queue to signal that no further values are coming. For instance, you might send None as the last value, just before the foo function ends. The parent code could check for that specific value and break out of its loop, rather than treating it like a normal value (and e.g. printing it). This sort of positive signal that the child code is done might be better than the negative signal of a timeout, since something going wrong (e.g. the child crashing) is less likely to be misinterpreted as the expected shutdown.
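A sketch of the sentinel approach, using None as the marker as suggested above. The names SENTINEL and consume and the delay parameter are hypothetical, introduced only for this example.

```python
import multiprocessing as mp
import time

SENTINEL = None  # marker meaning "no more items"; must never be a real progress value


def foo(queue, delay=0.1):
    # Same worker as in the question; the delay is shortened for this sketch.
    for i in range(10):
        queue.put(i)
        time.sleep(delay)
    queue.put(SENTINEL)  # positive signal that the worker is done


def consume(queue):
    """Yield items from the queue until the sentinel is received."""
    while True:
        item = queue.get()  # blocking is fine here: the sentinel ends the loop
        if item is SENTINEL:
            return
        yield item


if __name__ == '__main__':
    mp.set_start_method('spawn')
    queue = mp.Queue()
    p = mp.Process(target=foo, args=(queue,))
    p.start()
    for item in consume(queue):
        print(item)
    p.join()  # wait for the child to exit before continuing
    print("Process finished")
```

If the child crashes before sending the sentinel, this loop blocks indefinitely, so in practice it may be worth combining the sentinel with a get timeout.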