Python: sharing class variables across threads


I have a counter (training_queue) shared among many instances of a class. The class inherits from threading.Thread, so it implements a run() method. When I call start(), I expect each thread to increment this counter, so that once it reaches a limit no more threads are started. However, none of the threads modifies the variable. Here's the code:

class Engine(threading.Thread):

    training_mutex = threading.Semaphore(MAX_TRAIN)
    training_queue = 0
    analysis_mutex = threading.Semaphore(MAX_ANALYSIS)
    analysis_queue = 0
    variable_mutex = threading.Lock()


    def __init__(self, config):
        threading.Thread.__init__(self)
        self.config = config
        self.deepnet = None
        # prevents engine from doing analysis while training
        self.analyze_lock = threading.Lock()

    def run(self):
        with self.variable_mutex:
            self.training_queue += 1
        print(self.training_queue)
        with self.training_mutex:
            with self.analyze_lock:
                self.deepnet = self.loadLSTM3Model()
   

I protect training_queue with a Lock, so it should be thread-safe. However, if I print its value, it's always 1. How does threading affect variable scope in this case?


Solution

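  • Your understanding of how state is shared between threads is correct. However, you are incrementing the instance attribute self.training_queue instead of the class attribute Engine.training_queue.

    That is, self.training_queue += 1 reads the class attribute (0) and assigns the result to a new instance attribute, so every new object ends up with its own training_queue set to 1.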

    For example:

    import threading
    
    class Engine(threading.Thread):
        training_queue = 0
        print_lock = threading.Lock()
    
        def __init__(self, config):
            threading.Thread.__init__(self)
    
        def run(self):
            with Engine.print_lock:
                self.training_queue += 1
                print(self.training_queue)
    
    Engine('a').start()
    Engine('b').start()
    Engine('c').start()
    Engine('d').start()
    Engine('e').start()
    

    This prints:

    1
    1
    1
    1
    1
    

    But:

    import threading
    
    class Engine(threading.Thread):
        training_queue = 0
        print_lock = threading.Lock()
    
        def __init__(self, config):
            threading.Thread.__init__(self)
    
        def run(self):
            with Engine.print_lock:
                Engine.training_queue += 1  # <-here
                print(self.training_queue)
    
    Engine('a').start()
    Engine('b').start()
    Engine('c').start()
    Engine('d').start()
    Engine('e').start()
    

    Prints:

    1
    2
    3
    4
    5
    

    Note the difference between self.training_queue and Engine.training_queue.
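The difference shows up even without threads. Here is a minimal sketch, using a plain class as a stand-in for Engine and written to run on Python 3. It shows that the augmented assignment through self reads the class attribute and then stores the result as a new instance attribute, leaving the class attribute untouched:

```python
class Engine(object):
    training_queue = 0  # class attribute, shared by all instances


e = Engine()
# No instance attribute yet: attribute lookup falls back to the class.
before = dict(vars(e))

# Equivalent to: e.training_queue = e.training_queue + 1
# The read finds Engine.training_queue (0); the write creates e.training_queue.
e.training_queue += 1
after = dict(vars(e))

print(before)                  # {}
print(after)                   # {'training_queue': 1}
print(Engine.training_queue)   # 0
```

Writing Engine.training_queue += 1 (or type(self).training_queue += 1 inside a method) updates the shared class attribute instead.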

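    By the way, += is not atomic in Python: it is a separate read, add, and write, and another thread can run in between, so keep the lock around the increment. Also note the use of a lock for printing to stdout in the example above, which keeps the output lines from interleaving.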
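To tie this back to the question, here is a rough sketch of a class-level counter incremented from several threads with the lock held around the increment. Since += on a shared attribute is a read-modify-write, the lock is what makes the final count reliable. N_THREADS and N_INCREMENTS are made-up values for illustration, and run() is reduced to the counting part only:

```python
import threading

N_THREADS = 8          # hypothetical values, chosen for illustration
N_INCREMENTS = 10000


class Engine(threading.Thread):
    training_queue = 0                  # class attribute shared by all threads
    variable_mutex = threading.Lock()   # guards training_queue

    def run(self):
        for _ in range(N_INCREMENTS):
            # += is a read, an add, and a write; hold the lock across all three.
            with Engine.variable_mutex:
                Engine.training_queue += 1


threads = [Engine() for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # wait for every thread before reading the total

print(Engine.training_queue)   # 80000
```

Joining the threads before reading the counter matters here: without it, the main thread could print a partial total while workers are still running.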