Consider the following sample code:
from multiprocessing import Pool
def f(k):
    return k*k
ks = [1, 2, 3]
print("Hello")
if __name__ == '__main__':
    pool = Pool(2)
    k2 = pool.map(f, ks)
    pool.close()
    pool.join()
    print(k2)
On Windows, the output is:
Hello
Hello
Hello
[1, 4, 9]
which is odd and not what I expected.
Running the same code on Linux, the output is:
Hello
[1, 4, 9]
which is what I expected.
Why does print run three times on Windows? I assume that, in the same way, ks is also defined three times, and perhaps the import and the function definition are repeated three times as well. This wastes time and resources, and I don't understand why it is designed this way on Windows.
Accepting that this is how it works, should I move all the variable definitions and calculations that currently sit outside if __name__ == "__main__" into that block to avoid the wasted work? By the way, moving the function definition inside the block causes an error.
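Linux supports fork, an operation that can split one process into two at any point in the code, so each child starts with a copy of the parent's state and does not run the module again.
Windows does not support fork, so creating a subprocess is more complicated: each worker is a fresh Python interpreter that imports your main module again, and any data passed to it has to be serialized. Everything at module level therefore runs once per worker (with __name__ not equal to "__main__", so the guarded block is skipped), which gives one Hello from the parent plus one from each of the two workers. So yes: keep imports and function definitions at module level, and move expensive calculations and side effects such as print inside the guard. The function definition has to stay outside because the workers look up f by name in the re-imported module, and the guarded block never runs there.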
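As a minimal sketch of that layout, assuming the same f and ks as in the question: only the import and the function definitions stay at module level, and everything else moves into a function called from the guard. The name main is just a hypothetical helper for illustration, not something multiprocessing requires.

```python
from multiprocessing import Pool


def f(k):
    # Must stay at module level so worker processes can find it by name.
    return k * k


def main():
    # Hypothetical helper: everything here runs only in the parent process.
    ks = [1, 2, 3]
    print("Hello")
    with Pool(2) as pool:
        # map() blocks until all results are back, so leaving the
        # with-block afterwards is safe.
        k2 = pool.map(f, ks)
    print(k2)


if __name__ == '__main__':
    main()
```

With this structure the workers on Windows still re-import the module, but all they execute is the import and the two def statements, so Hello is printed once on both platforms.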