Tags: python, python-3.x, multiprocessing, python-multiprocessing, race-condition

How to make a multiprocessing.Array process-safe in Python


When constructing a multiprocessing.Array, the constructor takes a lock argument, which defaults to True.

My understanding is that when lock is set to True, the resulting multiprocessing.Array should be process-safe. However, I am not able to verify this behavior.

I am using the following code snippet to verify the behavior:

import multiprocessing


def withdraw(balance):
    for _ in range(10000):
        balance[0] = balance[0] - 1


def deposit(balance):
    for _ in range(10000):
        balance[0] = balance[0] + 1


# The guard keeps child processes from re-running this block under spawn.
if __name__ == "__main__":
    balance = multiprocessing.Array('f', [1000], lock=True)  ### Notice lock=True passed explicitly
    p1 = multiprocessing.Process(target=withdraw, args=(balance,))
    p2 = multiprocessing.Process(target=deposit, args=(balance,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print("Final balance = {}".format(balance[0]))

Each time I run this code I get a different final result, which suggests a race condition is affecting the run.

Can you please help me understand what I am doing or understanding wrong? As I understand it, the code snippet I posted should always print 1000.


Solution

  • The lock does not make the whole statement atomic. It only wraps each individual read or write. In balance[0] = balance[0] + 1, the read and the write are two separately locked operations, so in the gap between reading the current value and writing the new value, the other process can jump in and save a new value. That change is then clobbered when this one is saved.
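
To make that gap concrete, here is a rough single-process sketch of what the statement amounts to when lock=True. The names current and new_value are only illustrative and do not come from the question. The point is that the lock is taken and released once for the read and once more for the write.

```python
import multiprocessing

# Illustrative only: a single process, so nothing can interleave here.
balance = multiprocessing.Array('f', [1000], lock=True)

current = balance[0]       # read: the array's lock is acquired, then released
new_value = current + 1    # arithmetic on a local copy: no lock is held
balance[0] = new_value     # write: the lock is acquired again, then released

print(balance[0])
```

With two processes, the other one can perform its own read or write between the first and third steps, which is how an update gets lost.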

    There is an insightful note on this in the documentation for multiprocessing.Value: operations such as += that involve a read and a write are not atomic.

    Python 3.9 documentation

    If you bind both the read and the write into a single transaction by holding get_lock(), as that note describes, it will meet your expectations:

    def deposit(balance):
        for _ in range(10000):
            with balance.get_lock():
                balance[0] = balance[0] + 1
    
    
    def withdraw(balance):
        for _ in range(10000):
            with balance.get_lock():
                balance[0] = balance[0] - 1
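
Putting the pieces together, a complete script might look like the sketch below. The count parameter and the __main__ guard are additions for illustration and are not part of the original snippets. The guard matters on platforms where child processes are started with spawn rather than fork.

```python
import multiprocessing


def deposit(balance, count=10000):
    for _ in range(count):
        # Hold the lock across the read and the write so they act as one step.
        with balance.get_lock():
            balance[0] = balance[0] + 1


def withdraw(balance, count=10000):
    for _ in range(count):
        # Same lock object as in deposit, so the two updates cannot interleave.
        with balance.get_lock():
            balance[0] = balance[0] - 1


if __name__ == "__main__":
    balance = multiprocessing.Array('f', [1000], lock=True)
    p1 = multiprocessing.Process(target=withdraw, args=(balance,))
    p2 = multiprocessing.Process(target=deposit, args=(balance,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print("Final balance = {}".format(balance[0]))
```

Because lock=True creates a recursive lock by default, the locked reads and writes of balance[0] inside the with block do not deadlock against the lock the block already holds.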
    

    For fun and understanding, you can instead wrap each whole for loop in the lock and watch one process run its entire loop before the other gets a turn (for example, all the withdrawals sequentially, then all the deposits sequentially), thus:

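    import sys

    def withdraw(balance):
        # The lock is held for the entire loop, so no other process can touch
        # balance until every withdrawal has been made.
        with balance.get_lock():
            for _ in range(1, 1001):
                balance[0] = balance[0] - 1
                sys.stdout.write(f'w {balance[0]}\n')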