Can one terminate a python process which is a worker in a pool?


Each worker runs a long CPU-bound computation. The computation depends on parameters that can change anytime, even while the computation is in progress. Should that happen, the eventual result of the computation will become useless. We do not control the computation code, so we cannot signal it to stop. What can we do?

  1. Nothing: Let the worker complete its task and somehow recognize afterwards that the result is incorrect and must be recomputed. That would mean continuing to use a processor for a useless result, possibly for a long time.
  2. Don't use Pool: Create and join the processes as needed. We can then terminate the useless process and create another one. We can even keep bounds on the number of processes existing simultaneously. Unfortunately, we will not be reusing processes.
  3. Find a way to terminate and replace a Pool worker: Is terminating a Pool worker even possible? Will Pool create a replacement for the terminated one? If not, is there an external way of creating a new worker in a pool?

Solution

  • Given the strict "can't change computation code" limitation (which prevents checking for invalidation intermittently), your best option is probably #2.

    In this case, the downside you mention for #2 ("Unfortunately, we will not be reusing processes.") isn't a huge deal. Reusing processes matters when the work done by a process is small relative to the overhead of launching it. But it sounds like you're talking about processes that run for seconds or longer; the cost of forking a new process (the long-standing default on Linux and most other UNIX-likes; Python 3.14 changed the POSIX default to forkserver) is a trivial fraction of that, and spawning a process (the default on macOS since Python 3.8 and on Windows) is typically still measured in small fractions of a second.

    For comparison:

    Option #1 is wasteful; if you're anywhere close to using up your cores, and invalidation occurs with any frequency at all, you don't want to leave a core chugging on garbage indefinitely.

    Option #3, even if it worked, would work only by coincidence, and might break in a new release of Python, since the behavior of killing workers explicitly is not a documented feature.
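
A minimal sketch of option #2, assuming one fresh `multiprocessing.Process` per computation. The names `long_computation`, `start_worker`, `replace_worker`, and `demo`, as well as the `params` dictionary layout, are hypothetical stand-ins for illustration; the busy loop merely imitates CPU-bound code that never checks for cancellation. Each worker gets its own queue because the documentation warns that terminating a process while it is using a queue can corrupt that queue.

```python
import multiprocessing as mp
import time


def long_computation(params, result_queue):
    """Hypothetical stand-in for the CPU-bound code we cannot modify."""
    deadline = time.monotonic() + params["seconds"]
    while time.monotonic() < deadline:
        pass  # busy loop: never checks whether its parameters are stale
    result_queue.put(params["value"] * 2)


def start_worker(params, ctx):
    """Launch one fresh process, with a private result queue, for one task."""
    queue = ctx.Queue()
    proc = ctx.Process(target=long_computation, args=(params, queue))
    proc.start()
    return proc, queue


def replace_worker(old_proc, new_params, ctx):
    """Kill a worker whose parameters went stale and start a new one."""
    old_proc.terminate()  # SIGTERM on POSIX, TerminateProcess() on Windows
    old_proc.join()       # reap the process so it does not linger as a zombie
    # The old worker's queue is simply abandoned; it may be in a bad state.
    return start_worker(new_params, ctx)


def demo(ctx=None):
    """Start a long task, invalidate it mid-run, and return the fresh result."""
    ctx = ctx or mp.get_context()  # platform default start method
    proc, queue = start_worker({"value": 1, "seconds": 30.0}, ctx)
    time.sleep(0.2)  # the parameters "change" while the work is in progress
    proc, queue = replace_worker(proc, {"value": 21, "seconds": 0.1}, ctx)
    result = queue.get(timeout=10)  # read the result before joining
    proc.join()
    return result
```

Calling `demo()` under an `if __name__ == "__main__":` guard (which the spawn and forkserver start methods require) should return 42 in well under a second, even though the first worker was asked to run for 30 seconds. Bounding the number of simultaneous processes, as the question mentions, would need extra bookkeeping that this sketch leaves out.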