I submitted a question a week ago about persistent processes after terminating the ProcessPoolExecutor, but there have been no replies. I think this might be because not enough people are familiar with how ProcessPoolExecutor is coded, so I thought it would be helpful to ask a more general question to those who use the multiprocessing module.
The Python documentation states:
On POSIX using the spawn or forkserver start methods will also start a resource tracker process which tracks the unlinked named system resources (such as named semaphores or SharedMemory objects) created by processes of the program. When all processes have exited the resource tracker unlinks any remaining tracked object.
However, there is nothing in the documentation stating how to shut down this resource tracker when it is no longer needed. As far as I can tell, the tracker PID is not available to the ProcessPoolExecutor, but I did read somewhere that it might be accessible using a Pool instead. Can anyone confirm if this is true before I refactor my code?
You can stop it with the internal method _stop, but do so with caution: it is a private, undocumented API, so it may change or disappear between Python versions.
Below is an example demonstrating this:
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import resource_tracker


def example_function(x):
    return x * x


if __name__ == '__main__':
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(example_function, range(10)))

    # Manually stop the resource tracker
    resource_tracker._resource_tracker._stop()
    print("Resource tracker stopped.")
The code exits on my Linux Mint 21.2 Xfce machine with exit code 0, so I assume it does what it is intended to do.
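An exit code of 0 only shows that no exception was raised. To check that the tracker process has really gone, one option is to compare the tracker's internal _pid attribute before and after the call. The following is a minimal sketch, assuming a POSIX system and CPython's current internals: _resource_tracker, _pid and _stop are private names that may change, ensure_running is not part of the documented API either, and stop_tracker is a hypothetical helper introduced here for illustration. ensure_running() is called first so that a tracker exists even under the fork start method, where one may not have been started.

```python
from multiprocessing import resource_tracker


def stop_tracker():
    """Stop the resource tracker and return its PID before and after.

    Hypothetical helper: it relies on private CPython attributes
    (_resource_tracker, _pid, _stop), which are assumptions about
    the current implementation, not a public API.
    """
    tracker = resource_tracker._resource_tracker  # private module-level singleton
    pid_before = tracker._pid  # None if no tracker process is running
    tracker._stop()            # private: shuts the tracker process down
    return pid_before, tracker._pid


if __name__ == '__main__':
    # Make sure a tracker is running so there is something to stop.
    resource_tracker.ensure_running()
    before, after = stop_tracker()
    print("tracker PID before stop:", before)
    print("tracker PID after stop:", after)
```

If the second value printed is None, the tracker object no longer refers to a running process. Note that any later use of named semaphores or shared memory under spawn or forkserver may start a new tracker.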