Parallel writing to a list in Python


I have multiple parallel threads writing into one list in Python. My code is:

import threading

global_list = []

class MyThread(threading.Thread):
    ...
    def run(self):
        results = self.calculate_results()
        global_list.extend(results)


def total_results():
    for param in params:
        t = MyThread(param)
        t.start()
    # busy-wait until only the main thread is left
    while threading.active_count() > 1:
        pass
    return global_list

I don't like this approach because it has two problems:

  1. A global variable. How can I use a variable local to the `total_results` function instead?
  2. The way I check whether the list is ready to be returned seems clumsy. What is the standard way?

Solution

  • 1 - Use a class variable shared by all Worker instances to collect the results

    from threading import Thread
    
    class Worker(Thread):
        results = []
        ...
    
        def run(self):
            results = self.calculate_results()
            Worker.results.extend(results)  # in CPython, extending a list is thread safe (GIL)
    

    2 - Use join() to wait until all the threads are done. Unlike a busy-wait loop, join() blocks without consuming CPU time that the workers could use.

    def total_results(params):
        # create all workers
        workers = [Worker(p) for p in params]

        # start all workers
        for w in workers:
            w.start()

        # wait for all of them to finish
        for w in workers:
            w.join()

        # get the result
        return Worker.results
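
A class variable is still shared state: `Worker.results` keeps growing across repeated calls to `total_results`. To get a truly local variable, as the question asks, one option is to create the list inside the function and hand it to each worker. The following is only a minimal sketch under assumptions: `calculate_results` is a hypothetical stand-in for the real computation, and the explicit `Lock` is an addition so the code does not depend on CPython-specific behaviour. `total_results_pool` shows an alternative with the standard-library `ThreadPoolExecutor`, which avoids the shared list altogether.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def calculate_results(param):
    # Hypothetical stand-in for the real per-parameter computation.
    return [param, param * 2]


class Worker(threading.Thread):
    def __init__(self, param, results, lock):
        super().__init__()
        self.param = param
        self.results = results  # list owned by the caller, not a global
        self.lock = lock

    def run(self):
        partial = calculate_results(self.param)
        # The lock makes the shared update explicit instead of relying on
        # interpreter-specific atomicity of list.extend.
        with self.lock:
            self.results.extend(partial)


def total_results(params):
    results = []  # local to this call, so repeated calls do not accumulate
    lock = threading.Lock()
    workers = [Worker(p, results, lock) for p in params]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results


def total_results_pool(params):
    # Alternative without a shared list: each task returns its own results,
    # and map() yields them in the order of params.
    with ThreadPoolExecutor() as pool:
        partials = pool.map(calculate_results, params)
        return [item for partial in partials for item in partial]
```

With the thread-based version the order of the combined list depends on which worker finishes first, while the pool-based version keeps the order of `params`.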