In Python 3.8, `concurrent.futures.ProcessPoolExecutor` was updated to limit the maximum number of workers (processes) on Windows to 61. For the reasons why, see this and this, but to my understanding:
`multiprocessing` calls the Windows API function `WaitForMultipleObjects`, which is used to wait for processes to finish. It can wait on at most 63 objects, less the result queue reader and the thread wakeup reader, hence the 61 limit (i.e., Windows uses a thread per process to track processes). (See also this SO issue.)
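The arithmetic behind the limit can be sketched as follows. (CPython also hard-codes the result as the private constant `concurrent.futures.process._MAX_WINDOWS_WORKERS`; that name is an implementation detail, not public API.)

```python
# MAXIMUM_WAIT_OBJECTS is the documented Win32 ceiling on how many
# handles WaitForMultipleObjects can wait on at once. The pool reserves
# two handles for itself, leaving 61 slots for worker processes.
MAXIMUM_WAIT_OBJECTS = 63
RESERVED_HANDLES = 2  # result queue reader + thread wakeup reader
print(MAXIMUM_WAIT_OBJECTS - RESERVED_HANDLES)  # prints 61
```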
`multiprocessing`, however, still uses `os.cpu_count()`. It throws a `ValueError` at first, but then continues on and uses 100% of my CPU cores. For example:
```
Exception in thread Thread-N:
Traceback (most recent call last):
  File "C:\Users\username\AppData\Local\Programs\Python\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\username\AppData\Local\Programs\Python\Python38\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\pool.py", line 519, in _handle_workers
    cls._wait_for_updates(current_sentinels, change_notifier)
  File "C:\Users\username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\pool.py", line 499, in _wait_for_updates
    wait(sentinels, timeout=timeout)
  File "C:\Users\username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\connection.py", line 879, in wait
    ready_handles = _exhaustive_wait(waithandle_to_obj.keys(), timeout)
  File "C:\Users\username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\connection.py", line 811, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 98
```
where my machine has 96 cores. Is this "error" really an error? If not, should I just use the `multiprocessing` module instead of the `concurrent.futures` module, which limits my CPU usage to 61 cores?
Edit: I suspect it is an error, as I assume `multiprocessing` will continue to wait for the process that threw the error to finish. This seems to happen if I don't limit the number of cores (the program just hangs after the CPU usage dies down). However, I'm not sure it really is.
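One defensive workaround (my own sketch, not part of the standard library; the helper name `safe_worker_count` is mine) is to compute a capped worker count yourself and pass it to `multiprocessing.Pool`, mirroring the clamp `ProcessPoolExecutor` applies when `max_workers` is left unset:

```python
import os
import sys

def safe_worker_count(requested=None, platform=sys.platform):
    # Mirror ProcessPoolExecutor's Windows clamp: WaitForMultipleObjects
    # can wait on at most 63 handles, and two are reserved by the pool,
    # leaving room for 61 worker processes.
    count = requested or os.cpu_count() or 1
    if platform == "win32":
        count = min(count, 61)
    return count

# usage: multiprocessing.Pool(safe_worker_count())
```

On non-Windows platforms the requested count passes through unchanged, so the helper is safe to use unconditionally.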
Yours is an excellent question. Looking at the code, it would appear that this is an unrecoverable error. But it seems incomprehensible to me that there would be code in `ProcessPoolExecutor` to limit the pool size to 61 under Windows and no such enforcement for the `multiprocessing.Pool` class. Anyway, it should be easy enough to check with the following program. If it does not print `Done!` and hangs, I would say there is definitely a problem, and you should explicitly limit the pool size if you are using `multiprocessing.Pool`:
```python
import multiprocessing

def worker(x):
    return x ** 2

def main():
    pool = multiprocessing.Pool(96)
    results = pool.map(worker, range(96))
    assert len(results) == 96
    pool.close()
    pool.join()
    print('Done!')

if __name__ == '__main__':
    main()
```
But the fact that your program hangs is fairly conclusive evidence that the above program will hang as well; I suspect you will not even get to the `assert` statement. Either way, using a pool size greater than 61 on Windows will not be reliable.
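If you want the experiment to fail loudly instead of hanging forever, a variant (again my own sketch) is to use `map_async` with a timeout, so a stuck pool surfaces as `multiprocessing.TimeoutError` rather than blocking indefinitely:

```python
import multiprocessing

def worker(x):
    return x ** 2

def run(processes=96, timeout=60):
    # If the pool never recovers from the ValueError, .get() raises
    # multiprocessing.TimeoutError instead of blocking forever.
    with multiprocessing.Pool(processes) as pool:
        return pool.map_async(worker, range(processes)).get(timeout=timeout)

if __name__ == '__main__':
    print(len(run()))
```

A `TimeoutError` here on your 96-core Windows machine would confirm that the pool never recovered from the handle-limit error.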