I have a program in which I would like to run one of the functions in parallel for several arguments.
The program has the following structure:
# import statements

def function1():
    pass  # do something

def function2():
    pass  # do something

def main():
    function1()
I found several examples online of how to use the multiprocessing library, such as the following general template:
import multiprocessing

def worker(num):
    print('Worker:', num)
    return

if __name__ == '__main__':
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        p.start()
From what I understand, worker() is the function that is meant to run in parallel. But I am not sure where or how to use the if __name__ == '__main__': block in my code.
As of now that block is inside my main(), and when I run the program the worker function is not executed multiple times; instead my main() gets executed multiple times.
So where is the proper place to put the if __name__ == '__main__': block?
Blending together the two examples you provide, it would look like this:
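import multiprocessing

def worker(num):
    print('Worker:', num)

def main():
    # Start every process first so that they run concurrently.
    processes = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        p.start()
        processes.append(p)
    # Then wait for all of them to finish.
    for p in processes:
        p.join()

if __name__ == '__main__':
    main()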
Replace worker with function1, i.e. whichever function you'd like to parallelise.
The key part is calling main() from inside the if __name__ == '__main__': block. In this simple example you could just as easily put the body of main() directly under if __name__ == '__main__':.
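As a rough sketch of that substitution: the question does not show the signature of function1, so this assumes it takes a single argument, and both its body and the argument list are placeholders.

```python
import multiprocessing


def function1(arg):
    # Hypothetical stand-in for the real function1; the body is illustrative.
    print('function1 called with', arg)


def main(arguments):
    # One process per argument; start them all before joining any.
    processes = [
        multiprocessing.Process(target=function1, args=(arg,))
        for arg in arguments
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    # exitcode is 0 for each process that finished without an error.
    return [p.exitcode for p in processes]


if __name__ == '__main__':
    main(['a', 'b', 'c'])  # placeholder arguments
```

Note that args must be a tuple, hence the trailing comma in (arg,).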
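The if __name__ == '__main__': guard keeps the code in main() from running when the file is imported by another script or an interactive session. With multiprocessing it is not optional in general: under the spawn start method (the only one available on Windows, and the default on macOS since Python 3.8) each child process re-imports the main module, so unguarded top-level code runs again in every child. That is most likely why your main() was executed multiple times. The guard can be left out only where child processes are forked (the traditional default on Linux). See What does if __name__ == "__main__": do?.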
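So on a platform that forks its child processes, the simplest usage would be:
import multiprocessing

def worker(num):
    print('Worker:', num)

# No guard here: safe only with the fork start method.
processes = []
for i in range(5):
    p = multiprocessing.Process(target=worker, args=(i,))
    p.start()
    processes.append(p)
for p in processes:
    p.join()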
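To see why the guard matters under spawn, here is a small sketch that forces that start method regardless of the platform default. The top-level print and the use of os.getpid() are my additions for illustration. When saved to a file and run as a script, the "module loaded" line should appear once for the parent and once more for each child, because every spawned child re-imports the main module.

```python
import multiprocessing
import os

# Top-level code: runs in the parent and again in each spawned child.
print('module loaded in process', os.getpid())


def worker(num):
    print('Worker:', num)


if __name__ == '__main__':
    # Force spawn to mimic Windows/macOS, whatever the platform default is.
    ctx = multiprocessing.get_context('spawn')
    processes = [ctx.Process(target=worker, args=(i,)) for i in range(3)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```

Without the guard, the process-creating code would itself be re-run inside each child during that re-import, and Python raises a RuntimeError there rather than letting the children spawn endlessly.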
Edit: multiprocessing pool example
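import multiprocessing

def worker(num):
    # print('Worker:', num)
    return num

if __name__ == '__main__':
    pool = multiprocessing.Pool(multiprocessing.cpu_count())
    # imap yields results lazily, in input order.
    result = pool.imap(worker, range(5))
    print(list(result))
    pool.close()
    pool.join()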
Prints:
[0, 1, 2, 3, 4]
See also Python multiprocessing.Pool: when to use apply, apply_async or map? for more detailed explanations.
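For the original goal of running one function over several arguments and collecting the results, Pool.map is often the shortest route. A minimal sketch, assuming Python 3.3+ (for using the pool in a with statement); run_parallel is a hypothetical helper name and the squaring in worker is only a placeholder computation:

```python
import multiprocessing


def worker(num):
    # Illustrative computation; swap in the real work.
    return num * num


def run_parallel(values, processes=2):
    # The with-block shuts the pool down on exit.
    with multiprocessing.Pool(processes=processes) as pool:
        # map blocks until all results are ready and keeps input order.
        return pool.map(worker, values)


if __name__ == '__main__':
    print(run_parallel(range(5)))
```

Unlike imap, map returns a complete list, so there is no need to wrap the result in list(). If the function takes more than one argument, Pool.starmap accepts an iterable of argument tuples instead.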