Code won't execute in order with __name__ == '__main__'


I'm trying out multiprocessing with the example code below. It should print 01, 02, 03, hello world, 04, 05, but it prints 01, 02, 01, 05, 03, hello world, 04, 05 instead. Why does it go from 02 back to 01, then jump to 05, then back to 03? What am I missing, and how can I make it run in order? Thank you!

from multiprocessing import *

# large data/complex use multiprocessing, else use ordinary function
q = Queue() # comm between parent and child process
    

def f1(x,q):
    print('03')
    x = x + " world"
    q.put(x)


def main_f():
    print('01')
    mp = Process(target=f1,args=("hello",q,))

    if __name__ == '__main__': # only happen once, else ex 4 process to 16 to 64 endless
        print('02')

        mp.start()
        print(q.get())
        mp.join()

        print('04')

main_f()
print('05')

I expect the messages to print in the order 01, 02, 03, hello world, 04, 05.


Solution

  • When you use multiprocessing on a platform that uses the spawn method to create new processes, any code at global scope that is not within an if __name__ == '__main__': block is first executed by the child process, to initialize its storage, before the worker function f1 is invoked.
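A quick way to see whether this applies on your machine is to ask multiprocessing which start method it will use. This is only a minimal sketch; the variable names current and available are illustrative. Windows and macOS default to spawn, while Linux has traditionally defaulted to fork, where the module is not re-executed in the child.

```python
import multiprocessing as mp

# Start method the default context uses on this platform
# (one of "fork", "spawn" or "forkserver").
current = mp.get_start_method()

# Every start method this platform supports.
available = mp.get_all_start_methods()

print(f"current start method: {current}")
print(f"available start methods: {available}")
```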

    In your posted code, when the child process is created it will therefore execute the following statements in order:

    1. from multiprocessing import *
    2. q = Queue() # create global queue
    3. def f1(x, q): # create function definition
    4. def main_f(): # create function definition
    5. main_f() # call main_f
    6. print('05')

    In reality the only statement that needs to be executed by the child process before the worker method f1 is invoked is statement #3 above, which defines the worker function for the child process.

    Statement #1 imports a package not used by your child process. Doing this does not prevent the program from running correctly, but Python spends time performing an import that is not used.

    Statement #2 needlessly creates a new queue instance in the child process, distinct from the one created in the main process. It would be disastrous if your child process used this queue, since it would be putting elements on a different queue than the one the main process is getting from. Fortunately, function f1 does not reference the global queue; it uses the queue that is passed to it as an argument.

    Statement #4 defines a function not used by the child process. It doesn't prevent the program from running but is wasteful.

    Statement #5 invokes main_f. This is where your troubles begin. All the code within main_f that is not within an if __name__ == '__main__': block gets executed immediately, before your worker function is invoked. The guarded block itself is skipped, because in the child process the module's __name__ is not '__main__'. This is what causes the extra '01' to be printed.

    Statement #6 likewise is what causes the extra '05' to be printed.
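The effect of statements #5 and #6 can be reproduced without starting any process. The sketch below assumes CPython's behaviour of re-running the main script in a spawned child under the module name '__mp_main__', and imitates that with runpy.run_path. SCRIPT is a reduced stand-in for the posted code (no queue, no Process), and run_as is a hypothetical helper written for this illustration.

```python
import contextlib
import io
import os
import runpy
import tempfile

# Reduced stand-in for the posted script: same layout, no real child process.
SCRIPT = """
def main_f():
    print('01')
    if __name__ == '__main__':
        print('02')
        print('04')

main_f()
print('05')
"""


def run_as(run_name):
    """Run SCRIPT with __name__ set to run_name and return what it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")
        with open(path, "w") as fh:
            fh.write(SCRIPT)
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            runpy.run_path(path, run_name=run_name)
    return buf.getvalue().split()


# What the parent prints when the file is run as a script.
parent_output = run_as("__main__")
# What a spawned child prints while re-importing the same file.
child_output = run_as("__mp_main__")

print(parent_output)  # ['01', '02', '04', '05']
print(child_output)   # ['01', '05']
```

The second list is exactly the stray '01' and '05' that show up in the middle of the posted output.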

    At a minimum, to get your program working correctly, your code should therefore be:

    from multiprocessing import Process, Queue
    
    def f1(x,q):
        print('03')
        x = x + " world"
        q.put(x)
    
    
    def main_f():
        # large data/complex use multiprocessing, else use ordinary function
        q = Queue() # comm between parent n child process
    
        print('01')
        mp = Process(target=f1,args=("hello",q,))
    
        print('02')
    
        mp.start()
        print(q.get())
        mp.join()
    
        print('04')
    
    if __name__ == '__main__':
        main_f()
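As a rough check of this layout, the same file can be executed the way a spawned child would see it, again assuming CPython re-runs the main script under the name '__mp_main__'. FIXED_SCRIPT mirrors the corrected code, and import_like_child is a hypothetical helper for this sketch. No child process is started here.

```python
import contextlib
import io
import os
import runpy
import tempfile

# The corrected layout: everything that must run only once sits behind the guard.
FIXED_SCRIPT = """
from multiprocessing import Process, Queue

def f1(x, q):
    print('03')
    x = x + " world"
    q.put(x)

def main_f():
    q = Queue()
    print('01')
    mp = Process(target=f1, args=("hello", q))
    print('02')
    mp.start()
    print(q.get())
    mp.join()
    print('04')

if __name__ == '__main__':
    main_f()
"""


def import_like_child(source):
    """Execute source with __name__ == '__mp_main__'; return (stdout, globals)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fixed_script.py")
        with open(path, "w") as fh:
            fh.write(source)
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            namespace = runpy.run_path(path, run_name="__mp_main__")
    return buf.getvalue(), namespace


printed, namespace = import_like_child(FIXED_SCRIPT)
print(repr(printed))               # nothing is printed during the child's import
print(callable(namespace["f1"]))   # the worker function is still defined
```

The child's import now only defines functions, so no output appears until f1 itself runs.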
    

    If we want to eliminate all possible inefficiencies, i.e. prevent unnecessary statements from being executed when the child process is initialized, then:

    def f1(x,q):
        print('03')
        x = x + " world"
        q.put(x)
    
    if __name__ == '__main__':
        from multiprocessing import Process, Queue
    
        def main_f():
            # large data/complex use multiprocessing, else use ordinary function
            q = Queue() # comm between parent n child process
    
            print('01')
            mp = Process(target=f1,args=("hello",q,))
    
            print('02')
    
            mp.start()
            print(q.get())
            mp.join()
    
            print('04')
    
        main_f()