
Why is "Hello" printed three times when using Python multiprocessing on Windows?


Consider the following sample code:

from multiprocessing import Pool


def f(k):
    return k*k


ks = [1, 2, 3]
print("Hello")

if __name__ == '__main__':
    pool = Pool(2)
    k2 = pool.map(f, ks)
    pool.close()
    pool.join()
    print(k2)

On Windows, the output is:

Hello
Hello
Hello
[1, 4, 9]

which is weird and ugly, not what I expected.

Running the same code on Linux, the output is:

Hello
[1, 4, 9]

which is what I expected.

Why are there three prints on Windows? By the same logic, ks must also have been defined three times, and maybe the import and the function definition were repeated three times as well. This wastes time and resources, and I don't understand why Windows is designed this way.

OK, facing the facts: should I move all variable definitions and calculations that are currently outside if __name__ == "__main__" to the inside, to avoid wasting resources? BTW, moving the function definition inside causes an error.


Solution

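  • Linux supports fork, an operation that duplicates a running process at any point in the code, so the child inherits the parent's memory, including everything already defined at module level.

    Windows does not support fork, so creating a subprocess is more complicated: a fresh Python interpreter is started, the main module is imported again (which re-executes all module-level code), and the data the worker needs is serialized and sent to it. With Pool(2), the module-level print("Hello") therefore runs once in the parent and once in each of the two workers, which gives three lines. Code under if __name__ == '__main__': is skipped in the workers, because the module is not named __main__ there.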
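
As a rough sketch of how the question's script could be restructured (one possible layout, not the only one): the worker function f stays at module level so the worker processes can find it by name, while the print and the data setup move into a main() function that runs only in the parent. The name main and the extra line reporting the start method are additions for illustration and are not part of the original code.

```python
import multiprocessing as mp


def f(k):
    # Must stay at module level: workers locate it by importing this module.
    return k * k


def main():
    # Side effects and data setup live here, so only the parent runs them.
    print("Hello")
    # Illustrative addition: shows whether "fork" or "spawn" is in use.
    print("start method:", mp.get_start_method())
    ks = [1, 2, 3]
    pool = mp.Pool(2)
    k2 = pool.map(f, ks)
    pool.close()
    pool.join()
    print(k2)


if __name__ == "__main__":
    main()
```

With this layout "Hello" should appear once on both Windows and Linux. The import and the definition of f are still repeated in each worker on Windows, but those are cheap; the point is to keep expensive calculations and visible side effects out of module-level code.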