python, python-multiprocessing

Why does pool run the entire file multiple times?


I'm trying to understand the output from this Python 2.7.5 example script:

import time
from multiprocessing import Pool

print(time.strftime('%Y-%m-%d %H:%M', time.localtime(time.time())))
props2=[
            '170339',
            '170357',
            '170345',
            '170346',
            '171232',
            '170363',
            ]
def go(x):
     print(x)

if __name__ == '__main__':
    pool = Pool(processes=3)
    pool.map(go, props2)

print(time.strftime('%Y-%m-%d %H:%M', time.localtime(time.time())))  

This yields the output:

2015-08-06 10:13
2015-08-06 10:13
2015-08-06 10:13
170339
170357
170345
170346
171232
170363
2015-08-06 10:13
2015-08-06 10:13
2015-08-06 10:13

My questions are:

A) Why does the time print three times at the beginning and the end? I would have expected it to print the start time, and then the end time.

B) The real question - How do I get it to run one command multiple times, but all the others a single time?


Solution

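  • When multiprocessing starts a worker by launching a fresh interpreter, which is how it works on Windows, the worker imports the __main__ module, and importing a file executes every top-level statement in it again. Both of your time prints sit outside the if __name__ == '__main__' block, so they run again in every worker, while the pool setup inside the block runs only in the parent. (On platforms that fork, the workers inherit the parent's state and the file is not re-run.) If you remove the if __name__ == '__main__' guard, each worker tries to create its own pool when it re-imports the file, so the script keeps launching processes recursively; Python 3 detects this in the worker and raises a RuntimeError instead.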

    For the real question:

    In Python scripts, I try to avoid running statements or defining variables at global scope; apart from imports, the top level should hold only function definitions. I use the template below for all my Python scripts.

    import sys

    def main(argv):
        # main logic here
        pass

    if __name__ == '__main__':
        main(sys.argv)
    

    Because main() only runs when the file is executed directly, a script written this way can also be imported from another script to reuse its functions without triggering the main logic.
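Applied to the script from the question, that template might look roughly like the sketch below. The timestamp() helper and the return value of go() are my additions for illustration, not part of the original code. The pool is closed with close()/join() rather than a with block so the sketch should also run on Python 2.7.

```python
import time
from multiprocessing import Pool

# Cheap module-level data is fine: re-creating it in each worker has no side effects.
props2 = ['170339', '170357', '170345', '170346', '171232', '170363']


def timestamp():
    """Return the current local time, formatted as in the question (hypothetical helper)."""
    return time.strftime('%Y-%m-%d %H:%M', time.localtime())


def go(x):
    # Runs in a worker process, once per item in props2.
    print(x)
    return x


def main():
    # Everything in here runs only in the parent process.
    print(timestamp())
    pool = Pool(processes=3)
    try:
        results = pool.map(go, props2)
    finally:
        pool.close()
        pool.join()
    print(timestamp())
    return results


if __name__ == '__main__':
    main()
```

With both prints moved inside main(), a worker that re-imports the file only defines props2 and the functions, so the start and end times should each print once while go() still runs once per item.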