python numpy python-multiprocessing montecarlo

Issues when parallelizing a Monte Carlo method with Python multiprocessing


I want to speed up a Monte Carlo method that is based on NumPy array manipulation, using the multiprocessing module. After reading up on it, I wrote code like the following for some tasks:

import func1, func2, func3, ... # some manipulations on the SAME numpy ndarray; each works independently and returns an independent result
import multiprocessing as mp
if __name__ == '__main__':
   with mp.Pool(processes=mp.cpu_count()) as pool:
      task1 = pool.Process(target=func1, args=(arg1, arg2, ...))
      task2 = pool.Process(target=func2, args=(arg1, arg2, ...))
      task3 = pool.Process(target=func3, args=(arg1, arg2, ...))
      ...
      task1.start()
      task2.start()
      task3.start()
      ...
      variable1 = task1.join() # In my case, I need to get the return values of these functions
      variable2 = task2.join()
      variable3 = task3.join()
      ...

This follows most of the tutorials, but I got:

RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase.

I'm really new to this field and struggled for days before posting this question here. Could somebody kindly give me some suggestions?


Solution

  • I figured out the cause of this RuntimeError when I ran the same program on macOS. As far as I understand, the error comes from the way Windows starts child processes: it cannot fork, so multiprocessing spawns a fresh interpreter that re-imports the main module, and any process creation reached during that import triggers the error.

    To correct this, the simplest way is to move the body of the program into a function main() (even though in my case this was complicated), then call freeze_support() from the multiprocessing module under the if __name__ == '__main__': guard. So it will finally look like this:

    import func1, func2, func3, ... # some manipulations on the SAME numpy ndarray; each works independently and returns an independent result

    import multiprocessing as mp
    from multiprocessing import freeze_support
    def main():
       with mp.Pool(processes=mp.cpu_count()) as pool:
          task1 = pool.Process(target=func1, args=(arg1, arg2, ...))
          task2 = pool.Process(target=func2, args=(arg1, arg2, ...))
          task3 = pool.Process(target=func3, args=(arg1, arg2, ...))
          ...
          task1.start()
          task2.start()
          task3.start()
          ...
          # Note: Process.join() only waits for the process and returns None,
          # so the return values have to be passed back some other way.
          variable1 = task1.join() # In my case, I need to get the return values of these functions
          variable2 = task2.join()
          variable3 = task3.join()
          ...
    if __name__ == '__main__':
       freeze_support()
       main()
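
For getting the return values back, one option is to submit the functions to the pool instead of creating Process objects by hand. Below is a minimal sketch of that idea, not the code from the question: mean_of, variance_of, fraction_inside_unit_circle and run_tasks are hypothetical stand-ins for func1, func2, func3 and the body of main(), and the random (n, 2) array is made up for illustration. It assumes the functions are defined at module level so they can be pickled and sent to the workers.

```python
import multiprocessing as mp

import numpy as np


# Hypothetical stand-ins for func1, func2, func3: each one reads the same
# ndarray and returns an independent result.
def mean_of(samples):
    return float(np.mean(samples))


def variance_of(samples):
    return float(np.var(samples))


def fraction_inside_unit_circle(samples):
    # samples has shape (n, 2); count the points with x^2 + y^2 <= 1.
    inside = np.sum(np.sum(samples ** 2, axis=1) <= 1.0)
    return float(inside) / len(samples)


def run_tasks(samples, processes=3):
    """Submit each task to the pool and collect its return value."""
    tasks = [mean_of, variance_of, fraction_inside_unit_circle]
    with mp.Pool(processes=processes) as pool:
        # apply_async returns an AsyncResult; get() blocks until the worker
        # has finished and hands back the function's return value.
        pending = [pool.apply_async(task, (samples,)) for task in tasks]
        return [p.get() for p in pending]


if __name__ == '__main__':
    # No effect unless the program is frozen into a Windows executable.
    mp.freeze_support()
    rng = np.random.default_rng(0)
    samples = rng.random((100_000, 2))
    mean, var, frac = run_tasks(samples)
    print(mean, var, 4.0 * frac)  # 4 * frac is a rough Monte Carlo estimate of pi
```

Everything that starts processes stays under the if __name__ == '__main__': guard, so re-importing the module in a spawned child does not start new workers. The array is pickled and copied to each worker, which is fine for a read-only input of moderate size.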