Tags: python, parallel-processing, nested, multiprocessing, pool

Can you do nested parallelization using multiprocessing in Python?


I am new to multiprocessing in Python and I am trying to do the following:

import os
from multiprocessing import Pool
from random import randint

def example_function(a):

    new_numbers = [randint(1, a) for i in range(0, 50)]

    # inner pool, created inside a worker of the outer pool
    with Pool(processes=os.cpu_count()-1) as pool:
        results = pool.map(str, new_numbers)

    return results


if __name__ == '__main__':

    numbers = [randint(1, 50) for i in range(0, 50)]

    # outer pool over the list of numbers
    with Pool(processes=os.cpu_count()) as pool:
        results = pool.map(example_function, numbers)

    print("Final results:", results)

However, when running this I get: "AssertionError: daemonic processes are not allowed to have children".

Replacing either pool.map call with a plain for loop does make it work, e.g. for the second (outer) one:

results = []
for n in numbers:
    results.append(example_function(n))

However, since both the outer and inner tasks are very compute-intensive, I would like to parallelize both. How can I do this?


Solution

multiprocessing.Pool creates its worker processes with the daemon flag set to True. According to the Python documentation of the Process class, this prevents sub-processes from being created inside worker processes:

    The process’s daemon flag, a Boolean value. This must be set before start() is called.
    The initial value is inherited from the creating process. When a process exits, it attempts to terminate all of its daemonic child processes.
    Note that a daemonic process is not allowed to create child processes. Otherwise a daemonic process would leave its children orphaned if it gets terminated when its parent process exits. Additionally, these are not Unix daemons or services, they are normal processes that will be terminated (and not joined) if non-daemonic processes have exited.
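
You can observe this flag from inside a worker. Here is a minimal sketch (show_daemon is just an illustrative name):

from multiprocessing import Pool, current_process

def show_daemon(_):
    # Pool workers report daemon=True, which is why they are not
    # allowed to create child processes of their own.
    return current_process().daemon

if __name__ == '__main__':
    with Pool(processes=2) as pool:
        print(pool.map(show_daemon, range(2)))  # prints [True, True]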

In theory, you can create your own pool using a custom context that bypasses the daemonic process creation, so that non-daemonic processes are spawned. However, you should not do that, because the termination of those processes would be unsafe, as stated in the documentation quoted above. For completeness, a sketch of this (discouraged) trick is shown below.
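
This is a minimal sketch, assuming a recent CPython; NoDaemonProcess, NoDaemonContext and NestablePool are illustrative names, not a standard API:

import multiprocessing
import multiprocessing.pool

class NoDaemonProcess(multiprocessing.Process):
    # Always report daemon=False and ignore attempts to change it,
    # so processes of this kind are allowed to have children.
    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        pass

class NoDaemonContext(type(multiprocessing.get_context())):
    Process = NoDaemonProcess

# Sub-class multiprocessing.pool.Pool rather than multiprocessing.Pool,
# because the latter is only a wrapper function, not a class.
class NestablePool(multiprocessing.pool.Pool):
    def __init__(self, *args, **kwargs):
        kwargs['context'] = NoDaemonContext()
        super().__init__(*args, **kwargs)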

In fact, creating pools inside pools is not a good idea in practice anyway, since each process of the outer pool creates another pool of processes. This results in a very large number of processes, which is very inefficient. In some cases, the number of processes would even be too big for the OS to create them (there is a platform-dependent limit). For example, on a many-core chip like a recent 64-core AMD Threadripper processor with 128 hardware threads, the total number of processes would be 128 * 128 = 16384, which is clearly not reasonable.

The usual way to solve this problem is to reason about tasks rather than processes. Tasks are added to a shared queue, workers pick tasks from that queue to compute them, and a worker can spawn new work simply by putting new tasks into the shared queue. AFAIK, multiprocessing managers are useful for designing such a system. A sketch of this approach follows.
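
Here is a minimal sketch applied to the question's example, using a flat set of workers and a shared JoinableQueue instead of nested pools (the worker function, the 'outer'/'inner' task kinds and the fixed 50 x 50 task count are assumptions of this sketch):

import os
from multiprocessing import JoinableQueue, Process, Queue
from random import randint

def worker(tasks, results):
    # All workers share one task queue; an "outer" task enqueues its
    # "inner" sub-tasks instead of opening a pool of its own.
    while True:
        kind, value = tasks.get()
        if kind == 'outer':
            for _ in range(50):
                tasks.put(('inner', randint(1, value)))
        else:
            results.put(str(value))
        tasks.task_done()

if __name__ == '__main__':
    tasks = JoinableQueue()
    results = Queue()

    # Daemonic workers are fine here: they never create children and
    # are cleaned up automatically when the main process exits.
    for _ in range(os.cpu_count()):
        Process(target=worker, args=(tasks, results), daemon=True).start()

    for _ in range(50):
        tasks.put(('outer', randint(1, 50)))

    tasks.join()  # wait until every task and sub-task is done

    final = [results.get() for _ in range(50 * 50)]  # 50 outer x 50 inner
    print("Number of final results:", len(final))

Because sub-tasks are enqueued before task_done is called on their parent task, tasks.join() cannot return before all sub-tasks have completed as well.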