Tags: python, python-2.7, python-multithreading

Does python threading.Lock() lock everything that needs locking?


The script below is abstracted. My question is about the use of threading.Lock().

Locking limits access to "shared resources", but I am nervous about how far that goes. In this example I have object attributes that are lists of objects, which in turn have attributes that are arrays. In some cases the dependency goes even deeper.

Does Lock() "know" for sure about everything that needs to be locked?

The output of the script below is also shown. The purpose of the script is mostly for discussion - it doesn't fail, but I am not confident that it is locking everything it needs to.

start:   [array([0, 1]), array([0, 1, 2]), array([0, 1, 2, 3])]
append an object
done!
finish:  [array([505, 605]), array([10, 11, 12]), array([10, 11, 12, 13]), array([5])]


import time
from threading import Thread, Lock
import numpy as np

class Bucket(object):
    def __init__(self, objects):
        self.objects = objects

class Object(object):
    def __init__(self, array):
        self.array = array

class A(Thread):
    def __init__(self, bucket):
        Thread.__init__(self)
        self.bucket = bucket
    def run(self):
        locker = Lock()
        n = 0
        while n < 10:
            with locker:  
                objects = self.bucket.objects[:]  # makes a local copy of list each time
            for i, obj in enumerate(objects):
                with locker:
                    obj.array += 1
                time.sleep(0.2)
            n += 1
            print 'n: ', n
        print "done!"
        return

objects = []
for i in range(3):
    ob = Object(np.arange(i+2))
    objects.append(ob)
bucket = Bucket(objects)

locker = Lock()

a = A(bucket)
print [o.array for o in bucket.objects]

a.start()

time.sleep(3)

with locker:
    bucket.objects.append(Object(np.arange(1)))  # abuse the bucket!
    print 'append an object'
time.sleep(5)

print [o.array for o in bucket.objects]

Solution

You seem to misunderstand how a lock works.

A lock doesn't lock any objects; it only blocks thread execution.

The first thread that tries to enter a with locker: block succeeds.

If another thread tries to enter a with locker: block (with the same locker object), it is delayed until the first thread exits the block, so the two threads cannot change the variables inside the block at the same time.
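
For illustration, here is a minimal, self-contained sketch (the worker function and counter list are made up for this example) of two threads sharing one Lock object; whichever thread enters the with locker: block first makes the other wait:

import time
from threading import Thread, Lock

locker = Lock()           # one Lock object shared by both threads
counter = [0]             # a shared resource (a list, so the workers can mutate it)

def worker():
    for _ in range(5):
        with locker:              # only one thread at a time gets past this line
            value = counter[0]
            time.sleep(0.01)      # pretend the update takes a while
            counter[0] = value + 1

threads = [Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter[0])   # always 10: the read-modify-write steps never interleave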

Here your "shared resources" are the variables you are changing inside the blocks: as far as I can see, objects and obj.array. You are basically protecting them from concurrent access (protection that would really matter in a Python implementation without a GIL, for starters) because only one thread can change them at a time.

Old-timers call that a critical section: a region of code that only one thread can execute at a time.
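
Applied to your script, that pattern would look something like the sketch below, with a plain list standing in for bucket.objects and a single Lock created once and shared by the worker thread and the main thread:

import time
from threading import Thread, Lock

locker = Lock()             # created once, shared by every thread that touches the list
shared = [1, 2, 3]          # stand-in for bucket.objects

def worker():
    for _ in range(10):
        with locker:                # critical section around the read of the shared list
            snapshot = shared[:]
        for item in snapshot:       # work on the private copy outside the lock
            time.sleep(0.01)

t = Thread(target=worker)
t.start()

time.sleep(0.05)
with locker:                        # same Lock object, so this waits for the worker's block
    shared.append(4)                # "abuse the bucket" safely
t.join()
print(shared)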

Note that it is slightly dubious to use the same locker object for different resources; that has more chance of deadlocking, or of being slower than it needs to be.
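
One way to follow that advice (the lock and function names here are hypothetical, purely for illustration) is to give each independent resource its own Lock, so waiting on one never delays the other:

from threading import Lock

objects_lock = Lock()    # guards the list of objects
totals_lock = Lock()     # guards a completely separate running total

def append_object(objects, obj):
    with objects_lock:           # list manipulation happens under its own lock
        objects.append(obj)

def add_to_total(totals, value):
    with totals_lock:            # updating the total never blocks list appends
        totals[0] += value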

(And if you nest two with locker blocks on the same Lock you get a deadlock - you need an RLock if you want to do that.)
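
For example, a plain Lock deadlocks as soon as the thread that already holds it tries to take it again, while an RLock lets the same thread re-enter:

from threading import Lock, RLock

rlock = RLock()
with rlock:
    with rlock:          # fine: an RLock can be re-acquired by the thread that holds it
        print('re-entered the RLock')

lock = Lock()
# with lock:
#     with lock:         # deadlock: a plain Lock blocks even the thread that holds it
#         pass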

It's as simple as that.