I ran into a problem while writing Python code that uses the multiprocessing map function. A minimal example that reproduces the problem is:
import multiprocessing as mp
if __name__ == '__main__':
    def f(x):
        return x*x
    num_workers = 2
    with mp.Pool(num_workers) as p:
        print(p.map(f, [1,2,3]))
When I run this piece of code, I get the error message
AttributeError: Can't get attribute 'f' on <module '__mp_main__' from 'main.py'>
However, if I move the function f outside the main block, i.e.
import multiprocessing as mp
def f(x):
    return x*x
if __name__ == '__main__':
    num_workers = 2
    with mp.Pool(num_workers) as p:
        print(p.map(f, [1,2,3]))
it works. What is the difference between the two versions, and why do I get an error in the first one? Thanks in advance.
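Depending on your operating system and Python version, sub-processes will either be forked or spawned. Windows always spawns, and macOS has spawned by default since Python 3.8, whereas Linux has traditionally forked by default. A spawned child starts a fresh interpreter and re-imports your script under the name __mp_main__, so the code under if __name__ == '__main__': never runs in the child and f is never defined there. The pool passes f to the workers by name, so the lookup fails with the AttributeError you see. A forked child is a copy of the parent process and therefore already contains f, which is also why defining f at module level works with either start method.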
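You can force forking where it is available (it is not on Windows), but you need to fully understand the implications of doing so. For example, forking a process that already has several threads running can be unsafe.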
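To see which case applies on a given machine, you can ask multiprocessing directly. The following is a minimal sketch; the helper name describe_start_methods is hypothetical and only there for illustration, while get_start_method and get_all_start_methods are part of the standard library.

```python
import multiprocessing as mp
import sys


def describe_start_methods():
    """Return (default, available) start methods for this interpreter."""
    # Note: get_start_method() fixes the default context as a side effect,
    # so a later set_start_method() call needs force=True to change it.
    default = mp.get_start_method()
    # The list of methods this platform supports, e.g. no 'fork' on Windows.
    available = mp.get_all_start_methods()
    return default, available


if __name__ == '__main__':
    default, available = describe_start_methods()
    print(f"platform={sys.platform} default={default} available={available}")
```

If the default reported here is spawn (or forkserver), a function defined only under the main guard will not be visible to the workers.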
For this specific question, a workaround on platforms that support fork could be implemented as follows:
import multiprocessing as mp
from multiprocessing import set_start_method
if __name__ == '__main__':
    def f(x):
        return x*x
    # 'fork' is not available on Windows; call this once, before creating the pool
    set_start_method('fork')
    num_workers = 2
    with mp.Pool(num_workers) as p:
        print(p.map(f, [1,2,3]))
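If changing the start method for the whole program is more than you want, a context object can scope the choice to a single pool. This is only a sketch under some assumptions: the names square and parallel_squares are hypothetical, and the function is kept at module level so that the code still works when fork is unavailable and the default start method is used instead.

```python
import multiprocessing as mp


def square(x):
    # Defined at module level, so workers can find it under any start method.
    return x * x


def parallel_squares(values, num_workers=2):
    """Square each value in a pool, preferring fork where the platform has it."""
    # Passing None to get_context() selects the platform's default method.
    method = 'fork' if 'fork' in mp.get_all_start_methods() else None
    ctx = mp.get_context(method)
    # The context's Pool uses the chosen start method without touching
    # the global setting that set_start_method() would change.
    with ctx.Pool(num_workers) as p:
        return p.map(square, values)


if __name__ == '__main__':
    print(parallel_squares([1, 2, 3]))
```

The design choice here is that get_context leaves the global start method alone, so other code in the same program that relies on the default is not affected.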