
How to get the process number in the function when using Pool from multiprocessing


I am trying to get the current process number inside my function when multiprocessing with Pool. Here is the code that I am using to test this:

from multiprocessing.dummy import Pool
import itertools

def function(a,b,c):
    print("Value of a: {} Value of b : {} Constant : {}".format(a,b,c))

a = [4,5,6,7,8]
b = [11,12,13,14,15]

pool = Pool(3)
pool.starmap(function, zip(a,b,itertools.repeat(50)))
pool.close()
pool.join()

Right now my output for the function looks like this:

Value of a: 4 Value of b : 11 Constant : 50
...

What I really want is to also get the current process number inside my function, so that I know exactly which process is running the current iteration of the function. Something like this:

Value of a: 4 Value of b : 11 Constant : 50 Process : 1
Value of a: 5 Value of b : 12 Constant : 50 Process : 2
Value of a: 6 Value of b : 13 Constant : 50 Process : 3

I tried using multiprocessing.current_process().ident, but it shows this output:

Value of a: 4 Value of b : 11 Constant : 50 Thread : 33084
Value of a: 5 Value of b : 12 Constant : 50 Thread : 33084
Value of a: 6 Value of b : 13 Constant : 50 Thread : 33084

Should I use another method or attribute from multiprocessing to get the current process number?


Solution

  • You are using multiprocessing.dummy.Pool, which is actually a thread pool, not a process pool. Everything is still running in a single process, so multiprocessing.current_process().ident returns the same value in every thread. If you intended to use a thread pool, you could use threading.current_thread().ident to get a unique ID for each thread.

    If you intend to use a process pool, then multiprocessing.current_process().ident will work the way you expect once you switch. You could also use os.getpid(), which (at least on Linux) returns the same value.
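
    The process-pool variant might look like the sketch below. It assumes the only change needed is importing Pool from multiprocessing instead of multiprocessing.dummy; the data are the ones from the question, and the function returns its line instead of printing it so the parent process can print the results in order. The `if __name__ == "__main__":` guard is needed on platforms that start workers by re-importing the main module.

```python
import multiprocessing
import os
from itertools import repeat


def function(a, b, c):
    # current_process() describes whichever process is executing this call.
    proc = multiprocessing.current_process()
    return "Value of a: {} Value of b : {} Constant : {} Process : {} (pid {})".format(
        a, b, c, proc.name, proc.ident
    )


if __name__ == "__main__":
    a = [4, 5, 6, 7, 8]
    b = [11, 12, 13, 14, 15]

    # A real process pool: each of the 3 workers has its own pid.
    with multiprocessing.Pool(3) as pool:
        for line in pool.starmap(function, zip(a, b, repeat(50))):
            print(line)
```

    The worker names typically end in a small integer (something like ForkPoolWorker-1, with the prefix depending on the start method), while ident is the operating-system pid. Which worker picks up which pair of arguments is not deterministic.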

    If you want each thread to have a monotonically increasing ID that counts up from 1, you can do that by assigning the identifier yourself when each thread starts up, like this:

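    import itertools
    import threading
    from multiprocessing.dummy import Pool  # a thread pool, despite the module name

    # Thread-local storage: each worker thread sees its own d.id.
    d = threading.local()

    def set_num(counter):
        # Runs once in each worker thread when the pool starts it.
        d.id = next(counter) + 1

    def function(a,b,c):
        print("Value of a: {} Value of b : {} Constant : {} ID: {}".format(a,b,c,d.id))

    a = [4,5,6,7,8]
    b = [11,12,13,14,15]

    # All workers share one counter, so they are numbered 1, 2, 3.
    pool = Pool(3, initializer=set_num, initargs=(itertools.count(),))

    pool.starmap(function, zip(a,b,itertools.repeat(50)))
    pool.close()
    pool.join()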
    

    In CPython with the GIL, calling next() on an itertools.count() object is effectively atomic, so a single shared counter can hand a unique identifier to each thread in the pool as it is initialized. A threading.local object then stores that ID separately for each thread.

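    If you don't care about actually having an integer value, you can just use threading.current_thread().name, which is a string containing a number that counts up from 1, such as Thread-1 (Python 3.10 and later also append the name of the target function, for example "Thread-1 (worker)"). The numbers start at 1 only if no other auto-named threads were created earlier.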
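
    As a rough illustration of the thread-name approach, the sketch below keeps multiprocessing.dummy.Pool and reports both the name and the ident of the thread running each call. Returning the line instead of printing it is a choice made here, not part of the original code; it avoids interleaved output from concurrent prints.

```python
import threading
from itertools import repeat
from multiprocessing.dummy import Pool  # a thread pool, despite the module name


def function(a, b, c):
    # current_thread() describes whichever pool thread is executing this call.
    thread = threading.current_thread()
    return "Value of a: {} Value of b : {} Constant : {} Thread : {} (ident {})".format(
        a, b, c, thread.name, thread.ident
    )


if __name__ == "__main__":
    a = [4, 5, 6, 7, 8]
    b = [11, 12, 13, 14, 15]

    with Pool(3) as pool:
        for line in pool.starmap(function, zip(a, b, repeat(50))):
            print(line)
```

    The ident values are arbitrary integers that may be reused after a thread exits, so the name or a self-assigned counter is the better choice when a readable label is wanted.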