
How to share data between two processes?


How can I share values from one process with another? Apparently I can do that with multithreading but not with multiprocessing, and multithreading is too slow for my program.

I cannot show my exact code so I made this simple example.

from multiprocessing import Process
from threading import Thread
import time

class exp:
    def __init__(self):
        self.var1 = 0
            
    def func1(self):

        self.var1 = 5
        print(self.var1)

    def func2(self):

        print(self.var1) 


if __name__ == "__main__":

    #multithreading
    obj1 = exp()
    t1 = Thread(target = obj1.func1)
    t2 = Thread(target = obj1.func2)
    print("multithreading")
    t1.start()
    time.sleep(1)
    t2.start()

    time.sleep(3)


    #multiprocessing
    obj = exp()
    p1 = Process(target = obj.func1)
    p2 = Process(target = obj.func2)

    print("multiprocessing")
    p1.start()
    time.sleep(2)
    p2.start()

Expected output:

multithreading
5
5
multiprocessing
5
5

Actual output:

multithreading
5
5
multiprocessing
5
0

Solution

  • I know there have been a couple of close votes against this question, but the supposed duplicate's answer does not really explain why the OP's program does not work as is, and the offered solution is not what I would propose. Hence:

    Let's analyze what is happening. The creation of obj = exp() is done by the main process. The execution of exp.func1 occurs in a different process/address space, and therefore the obj object must be serialized/de-serialized into the address space of that process. In that new address space self.var1 comes across with the initial value of 0 and is then set to 5, but only the copy of the obj object that is in the address space of process p1 is being modified; the copy of that object that exists in the main process has not been modified. Then when you start process p2, another copy of obj from the main process is sent to the new process, still with self.var1 having a value of 0.

    The solution is to make self.var1 an instance of multiprocessing.Value, a special variable that lives in shared memory accessible to all processes. See the docs.

    from multiprocessing import Process, Value
    
    class exp:
        def __init__(self):
            self.var1 = Value('i', 0, lock=False)
    
        def func1(self):
    
            self.var1.value = 5
            print(self.var1.value)
    
        def func2(self):
    
            print(self.var1.value)
    
    
    if __name__ == "__main__":
    
        #multiprocessing
        obj = exp()
        p1 = Process(target = obj.func1)
        p2 = Process(target = obj.func2)
    
        print("multiprocessing")
        p1.start()
        # No need to sleep, just wait for p1 to complete
        # before starting p2:
        #time.sleep(2)
        p1.join()
        p2.start()
        p2.join()
    

    Prints:

    multiprocessing
    5
    5
    

    Note

    Using shared memory for this particular problem is much more efficient than using a managed class, which is referenced by the "close" comment.

    The assignment of 5 to self.var1.value is an atomic operation and does not need to be synchronized. But if:

    1. We were performing a non-atomic operation (requires multiple steps) such as self.var1.value += 1 and:
    2. Multiple processes were performing this non-atomic operation in parallel, then:
    3. We should create the value with a lock: self.var1 = Value('i', 0, lock=True) and:
    4. Update the value under control of the lock: with self.var1.get_lock(): self.var1.value += 1