Tags: python, python-3.x, multiprocessing, python-multiprocessing, mutex

Python script not creating enough output lines


I need to write 1,000,000 random numbers to a single file using multiprocessing in Python, and note how it affects the execution time. For some reason, whenever I run the script I end up with only 997,716 numbers.

Here's my code:

import time
import random
from multiprocessing import Process, Lock

mutex = Lock()

def task1():
    for i in range(500000):
        mutex.acquire()
        f.write(str(random.randint(0,1000)) + "\n")
        mutex.release()
       
def task2():
    for i in range(500000):
        mutex.acquire()
        f.write(str(random.randint(0,1000)) + "\n")
        mutex.release()

start = time.time()

f = open("file2.txt","a")
       
p1 = Process(target = task1, args = ())
p2 = Process(target = task2, args = ())

p1.start()
p2.start()

end = time.time()

print('Execution Time: {}'.format(end-start))

Solution

  • Each process has its own in-memory buffer for the file (Python buffers I/O), so you need to explicitly flush that buffer to disk while still holding the lock if you want to be sure no race condition occurs.

    def task1():
        for i in range(500000):
            mutex.acquire()
            f.write(str(random.randint(0,1000)) + "\n")
            f.flush()   # flush to disk while still holding the lock
            mutex.release()
    

    Flushing is slow, though, so you may want to have only one process do all the writing to disk, with the other processes sending their lines to it through a multiprocessing.Queue.
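
    As a rough sketch of that single-writer pattern (my illustration, not code from the answer; the names producer and writer are made up), the workers push lines onto a Queue and one dedicated process drains it into the file:

    import random
    from multiprocessing import Process, Queue

    def producer(q):
        # worker: generate lines and hand them to the writer via the queue
        for i in range(500000):
            q.put(str(random.randint(0, 1000)) + "\n")

    def writer(q, path):
        # only this process touches the file, so no lock is needed
        with open(path, "a") as f:
            while True:
                line = q.get()
                if line is None:   # sentinel: all producers are done
                    break
                f.write(line)
        # leaving the with-block closes (and therefore flushes) the file

    if __name__ == "__main__":
        q = Queue()
        w = Process(target=writer, args=(q, "file2.txt"))
        w.start()
        producers = [Process(target=producer, args=(q,)) for _ in range(2)]
        for p in producers:
            p.start()
        for p in producers:
            p.join()
        q.put(None)                # tell the writer to stop
        w.join()

    The flush cost is then paid once, when the writer closes the file, rather than once per line.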


    Fork BUG

    There's currently a bug in CPython: when using fork (the default process-creation method on Linux), the forked children never garbage-collect the global scope, so your file object is not flushed at the end of the child's function, and some data written to the file may be lost.

    A solution to this is to either

    1. call .close() or .flush() on the file at the end of the worker function, or
    2. make the open("filename", "a") call inside the child process instead of the parent (first closing the file in the child if it was already open in the parent).

    This won't solve the synchronization problem between the workers on its own; it just makes sure they don't terminate before they flush their internal buffers. (Both fixes are combined in the sketch below.)
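
    Putting the pieces together, here is a minimal corrected sketch (mine, not the answer's code): the file is opened inside each child, flushed while the lock is held, and closed before the worker returns, and the parent joins both workers so the timing actually covers the writes. The lock is passed as an argument so it is shared even under the spawn start method:

    import time
    import random
    from multiprocessing import Process, Lock

    def task(lock, path):
        # open inside the child so no parent buffer is inherited
        f = open(path, "a")
        for i in range(500000):
            lock.acquire()
            f.write(str(random.randint(0, 1000)) + "\n")
            f.flush()          # flush while still holding the lock
            lock.release()
        f.close()              # flush whatever is left before exiting

    if __name__ == "__main__":
        lock = Lock()          # passed explicitly so it is shared under spawn too
        start = time.time()
        p1 = Process(target=task, args=(lock, "file2.txt"))
        p2 = Process(target=task, args=(lock, "file2.txt"))
        p1.start()
        p2.start()
        p1.join()              # wait for both workers before stopping the clock
        p2.join()
        end = time.time()
        print('Execution Time: {}'.format(end - start))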