python multiprocessing python-multiprocessing

Multiprocessing in Python2.7

I'm trying to understand "multiprocessing" module more through examples before i start applying it to my main code,and i get little confused from the execution sequence in this code .

The Code :

import multiprocessing as mp
import time
import os

def square( nums , r , t1 ) :
    print ("square started at :")
    print ("%.6f" % (time.clock()-t1))
    for n in nums :
        r.append(n*n)
    print ("square endeded at :")
    print ("%.6f" % (time.clock()-t1))

def cube ( nums , r , t1 ) :
    #time.sleep(2)
    print ("cube started at   :")
    print ("%.6f" % (time.clock()-t1))
    for n in nums :
        r.append(n*n*n)
    print ("cube endeded at   :")
    print ("%.6f" % (time.clock()-t1))


if __name__ == "__main__" :
    numbers = range(1,1000000)
    results1 = []
    results2 = []
    t1 = time.clock()

    # With multiprocessing :
    p1 = mp.Process(target = square , args = (numbers , results1 , t1))
    p2 = mp.Process(target = cube   , args = (numbers , results2 , t1))
    p1.start()
    #time.sleep(2)
    p2.start()
    p1.join()
    print ("After p1.join()   :")
    print ("%.6f" % (time.clock()-t1))
    p2.join()

    '''
    # Without multiprocessing :
    square(numbers , results1 ,t1)
    cube(numbers , results2 , t1)

    '''
    print ("square + cube :")
    print ("%.6f" % (time.clock()-t1))

The code output was :

square started at :  
0.000000  
square endeded at :  
0.637105  
After p1.join()   :  
12.310289  
cube started at   :  
0.000000  
cube endeded at   :  
0.730428  
square + cube :  
13.057885

And i have few questions :

according to the code and timing above should it be in this order ?

square started at : cube started at : square endeded at : cube endeded at : After p1.join() : square + cube :

why it takes so long from the program to reach (p1.join()) despite of it finished the "square" several seconds earlier ?
in another word why square & cube take around 13 seconds to run while there real time execution is 0.7s!
in my main code i would like to start the second function (cube in this example) after a one second delay from the first function,so i tried to put a delay (time.sleep(1)) between "p1.start()" and "p2.start()"but it did not work and both functions still starting at (0.000000s),then i placed the delay at the the beggining of the "cube " function and it also did not work , so my question is how to achive a delay between this two functions ?

Solution

When dealing with multithreading, all sorts of other factors can impact what you are seeing. Since you are literally adding subprocesses to the process manager of your OS, they will operate entirely separately from your running program, including having their own resources, scheduling priorities, pipes, etc.

1.) No. The reason is that each child process gets its own output buffer that it is writing into that gets written back into the parent process. Since you start both child processes and then tell the parent process to block the thread until subprocess p1 completes, the p2 child process cannot write its buffer into the parent until the p1 process completes. This is why, despite waiting 12 seconds, the output of the p2 process still says 0.7 seconds.

2.) It is difficult to know for certain why it took 12 seconds for the subprocess to run its course. It may be something in your code or it may be a dozen other reasons, such as a complete different process hijacking your CPU for a time. First off, time.clock is probably not what you are looking for if you are trying to measure actual time versus how much time the process has spent on the CPU. Other commenters have correctly recommended using a high performance counter to accurately track timings to make sure there is not any weirdness in the way you are measuring time. Furthermore, there is always some level of overhead, though certainly not 12 seconds worth, when starting, running and terminating a new process. The best way to determine if this 12 seconds is something you could have controlled for or not is to run the application multiple times and see if there is a wide variance of total resulting times. If there is, it may be other conditions related to the computer running it.

3.) I am guessing that the issue is the time.clock measurement. The time.clock call calculates how much time a process has spent on the CPU. Since you are using multiple processes, time.clock is reset to 0 when a process is started. It is a relative time, not an absolute time, and relative to the lifespan of the process. If you are jumping between processes or sleeping threads, time.clock won't necessarily increment the way you would think with an absolute time measurement. You should use something like time.time() or better yet, a high performance counter to properly track real time.