Tags: python, windows, multiprocessing, process-pool

Multiprocessing.pool.Pool on Windows: CPU limit of 63?


When using Python's multiprocessing.pool.Pool with more than 63 cores, I get a ValueError:

from multiprocessing.pool import Pool

def f(x):
    return x

if __name__ == '__main__':
    with Pool(70) as pool:
        arr = list(range(70))
        a = pool.map(f, arr)
        print(a)

Output:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\fischsam\Anaconda3\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\fischsam\Anaconda3\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\fischsam\Anaconda3\lib\multiprocessing\pool.py", line 519, in _handle_workers
    cls._wait_for_updates(current_sentinels, change_notifier)
  File "C:\Users\fischsam\Anaconda3\lib\multiprocessing\pool.py", line 499, in _wait_for_updates
    wait(sentinels, timeout=timeout)
  File "C:\Users\fischsam\Anaconda3\lib\multiprocessing\connection.py", line 879, in wait
    ready_handles = _exhaustive_wait(waithandle_to_obj.keys(), timeout)
  File "C:\Users\fischsam\Anaconda3\lib\multiprocessing\connection.py", line 811, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 72
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69]

The program seems to work fine; the result is just what I expected. Can I ignore the ValueError?

Background: I have googled this issue, and it seems to be due to a limitation of Python on Windows (_winapi.WaitForMultipleObjects); see e.g. here. The suggested fix is to limit the number of cores used to 63. This is not satisfactory, because I want to use 100+ cores on my server. Do I really need to limit the cores? Why? Is there a workaround?
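
For reference, a minimal sketch of that suggested fix, assuming the only goal is to silence the error rather than to use more workers: cap the pool size below the Windows handle limit. The cap of 61 mirrors the one concurrent.futures.ProcessPoolExecutor enforces on Windows; the worker count of 70 is the one from the example above.

from multiprocessing.pool import Pool

def f(x):
    return x

if __name__ == '__main__':
    # Cap the pool below the Windows handle limit of
    # _winapi.WaitForMultipleObjects; 61 is the cap that
    # concurrent.futures.ProcessPoolExecutor applies on Windows.
    with Pool(min(70, 61)) as pool:
        arr = list(range(70))
        a = pool.map(f, arr)
        print(a)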


Solution

  • I have written a package ("ObjectProxyPool") to do object-based parallelism with any number of parallel processes.

    (Object-based parallelism: You have a copy of an object in each parallel process and can call its methods without having to re-initialize the object.)

    The package is available on PyPI. You can use it for tasks similar to classical multiprocessing.

    import os
    from objectproxypool import ProxyPool
    import numpy as np
    
    # Define a dummy class containing only the function
    # you are interested in.
    class MyClass:
        
        # Implement the function you are interested in.
        # Static methods are not supported.
        # You may just ignore the self argument.
        def f(self, x):
            return os.getpid()
    
    if __name__ == '__main__':
        
        # Create a proxy pool with 70 workers in independent processes. 
        # If `separateProcesses = False`, the workers will be threads.
        with ProxyPool(MyClass, numWorkers=70, separateProcesses=True) as myObject:
            
            # myObject is a proxy to the object of type MyClass.
        # `map_args = True` ensures that each worker gets one argument at a time.
        # Otherwise, each worker would receive the whole argument `range(100)`.
            a = myObject.f(range(100), map_args=True)
            
        print("We used {} different processes.".format(np.unique(a).size))
        print(a)
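
    Since f returns os.getpid(), the result a holds one process ID per argument, so np.unique(a).size counts how many distinct worker processes handled the 100 calls (at most the 70 workers in the pool).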