python, multithreading, pool

Why do I see extra line breaks when working with a Python multiprocessing pool?


Example:

from multiprocessing.dummy import Pool as ThreadPool

def testfunc(string):
    print string

def main():

    strings = ['one', 'two', 'three', ...]
    pool = ThreadPool(10)
    results = pool.map(testfunc, strings)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()

This does not give clean output with one result per line:

one
two 
three

Instead I get a mess with random line breaks, like:

one 


two
three

four
five
...

Why does this happen? Can I output my data with exactly one line break per function call?

P.S. Sometimes there are no line breaks or even spaces between results at all!
P.P.S. I am working under Windows.


Solution

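  • print is not an atomic operation: the text and the trailing newline are written separately, so one print can be interrupted in the middle by another print in a different thread (multiprocessing.dummy.Pool is backed by threads, not processes). You can prevent two threads from calling print simultaneously by putting a Lock around it.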

    from multiprocessing.dummy import Pool as ThreadPool
    from multiprocessing import Lock
    
    print_lock = Lock()
    def testfunc(string):
        # The with block releases the lock even if print raises.
        with print_lock:
            print(string)  # parenthesized form runs on Python 2 and 3
    
    def main():
    
        strings = ['one', 'two', 'three', 'four', 'five']
        pool = ThreadPool(10)
        results = pool.map(testfunc, strings)
        pool.close()
        pool.join()
    
    if __name__ == '__main__':
        main()
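
To see why the lock helps, here is a minimal sketch, assuming Python 3 on CPython. It swaps sys.stdout for a small recording object so the individual write() calls become visible. RecordingStream and locked_print are hypothetical names made up for this illustration, and threading.Lock is used in place of multiprocessing.Lock since the pool is thread-based.

```python
# Sketch for Python 3; RecordingStream and locked_print are hypothetical names.
import threading
from multiprocessing.dummy import Pool as ThreadPool


class RecordingStream:
    """Stand-in for sys.stdout that records each write() call separately."""

    def __init__(self):
        self.writes = []

    def write(self, text):
        self.writes.append(text)
        return len(text)

    def flush(self):
        pass


# One print() call reaches the stream as separate writes for text and newline.
single = RecordingStream()
print('one', file=single)

# With the lock held for the whole print(), those writes cannot be split
# apart by another thread's output.
shared = RecordingStream()
print_lock = threading.Lock()


def locked_print(string):
    with print_lock:
        print(string, file=shared)


strings = ['one', 'two', 'three', 'four', 'five']
pool = ThreadPool(10)
pool.map(locked_print, strings)
pool.close()
pool.join()

# Everything the worker threads wrote, joined into one string.
output = ''.join(shared.writes)
```

Inspecting single.writes should show the text and the newline as two entries, which is the gap another thread can slip into. In the locked run each word in shared.writes is followed directly by its own newline, although the order of the words still depends on thread scheduling.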