Sending a complex object (nested classes) over multiprocessing queue


I have two processes, created using the multiprocessing library, and I am trying to use a queue to send a complex object between them. The complex object is made up of nested classes with variables for holding data.

Variables in the parent class make it through the queue "correctly", but variables in a nested class are at their default values, rather than the ones set by the sending process.

Example code:

import multiprocessing


class ChildData:
    testBool: bool = False
    testInt1: int = 20
    testInt2: int = 6500

class ParentData:
    testBool: bool = False
    childData: ChildData = ChildData()

class Reciever:
    queue: multiprocessing.Queue

    def __init__(self, queue):
        self.queue = queue

    def run(self):
        queueValue = self.queue.get()
        print(queueValue.testBool)
        print(queueValue.childData.testBool)

class Sender:
    queue: multiprocessing.Queue

    def __init__(self, queue):
        self.queue = queue

    def run(self):
        dataToSend = ParentData()
        dataToSend.testBool = True
        dataToSend.childData.testBool = True
        self.queue.put(dataToSend)

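if __name__ == "__main__":
    # The guard keeps child processes from re-running this block
    # when the start method is "spawn" (Windows, macOS).
    queue = multiprocessing.Queue()

    reciever = Reciever(queue)
    recieverProcess = multiprocessing.Process(target=reciever.run)
    recieverProcess.start()

    sender = Sender(queue)
    senderProcess = multiprocessing.Process(target=sender.run)
    senderProcess.start()

    # Wait for both processes to finish.
    senderProcess.join()
    recieverProcess.join()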

I would expect (and desire) the above code to output:

True
True

But instead it outputs:

True
False

I have two questions:

  • What am I doing wrong, and how do I achieve what I want with my given data structure?
  • When I send an object through a queue, am I sending a copy (changes in one process won't affect the object in the other process) or a reference (changes in one process will affect the object in the other process)? I want to send a copy.

Solution

  • You need to stop using class variables if you want instance state. To transfer data between processes, the queue uses pickle. When you unpickle an object, its instance state is recreated from the pickle, but the class state simply comes from whatever the class looks like in the receiving process at that time (basically, pickle looks for a class with the same name). The object you pickled had no instance attribute childData, because you never set one: your class definitions have no __init__ methods and rely on class variables instead. So on this line:

    dataToSend.childData.testBool = True
    

    Here, dataToSend.childData resolves to the object that is set on the class ParentData, and that shared object is what gets mutated. But the class in the receiving process is not affected when the instance of ParentData is unpickled. So when the attribute is looked up there, it doesn't see any changes that occurred in the other process, because pickle doesn't save class state.
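
    The sharing can be seen without any processes or queues. The following is a minimal sketch that reuses trimmed-down versions of the classes from the question; the variable names first and second are only for illustration.

```python
class ChildData:
    testBool: bool = False


class ParentData:
    testBool: bool = False
    # Evaluated once, when the class body runs: one ChildData for the whole class.
    childData: ChildData = ChildData()


first = ParentData()
second = ParentData()

first.testBool = True            # creates an instance attribute on `first`
first.childData.testBool = True  # mutates the single ChildData stored on the class

# Only testBool lives in the instance's own state, so that is all pickle saves.
print(vars(first))                          # {'testBool': True}
# Both instances resolve childData to the same class-level object.
print(first.childData is second.childData)  # True
print(second.childData.testBool)            # True
```

    Since childData never appears in vars(first), it is not part of what travels through the queue.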

    Here is how you should define your class:

    class ChildData:
        # Don't create class variables! only type annotations if you want
        testBool: bool
        testInt1: int
        testInt2: int
        def __init__(self, testBool: bool = False, testInt1: int = 20, testInt2: int = 6500):
            self.testBool = testBool
            self.testInt1 = testInt1
            self.testInt2 = testInt2
    
    class ParentData:
        # no class variables here, only type annotations
        testBool: bool
        childData: ChildData
        def __init__(self, testBool: bool = False, childData: ChildData | None = None) -> None:
            self.testBool = testBool
            if childData is None:
                self.childData = ChildData()
            else:
                self.childData = childData
    

    So no class variables.
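
    A possible alternative, if you'd rather not write the __init__ methods by hand, is the standard library's dataclasses module, whose generated __init__ assigns every field on the instance. This is a minimal sketch, assuming Python 3.7 or later; field(default_factory=ChildData) gives each ParentData its own ChildData, and the pickle round trip is only a single-process stand-in for what the queue does.

```python
import pickle
from dataclasses import dataclass, field


@dataclass
class ChildData:
    testBool: bool = False
    testInt1: int = 20
    testInt2: int = 6500


@dataclass
class ParentData:
    testBool: bool = False
    # default_factory is called for each new ParentData, so no ChildData is shared.
    childData: ChildData = field(default_factory=ChildData)


dataToSend = ParentData()
dataToSend.testBool = True
dataToSend.childData.testBool = True

# childData is now part of the instance state, so pickle carries it along.
print("childData" in vars(dataToSend))   # True
# A fresh instance is unaffected: each one gets its own ChildData.
print(ParentData().childData.testBool)   # False

# Stand-in for the queue: pickle on the sending side, unpickle on the receiving side.
received = pickle.loads(pickle.dumps(dataToSend))
print(received == dataToSend)            # True
print(received.childData.testBool)       # True
```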

    As an aside, a terminology note: these classes aren't nested. You are using composition, but you always use composition in Python.
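
  • On your second question: you are sending a copy. The queue pickles the object you put on it, and the receiving process builds a new object from those bytes, so changes made on one side afterwards are not seen on the other. A minimal single-process sketch, using a pickle round trip as a stand-in for the queue and a hypothetical Payload class:

```python
import pickle


class Payload:
    # Hypothetical class used only for this illustration.
    def __init__(self, values):
        self.values = values


original = Payload([1, 2, 3])

# Roughly what happens between put() in one process and get() in the other.
received = pickle.loads(pickle.dumps(original))

# Mutating the received object leaves the sender's object untouched.
received.values.append(4)
print(original.values)       # [1, 2, 3]
print(received.values)       # [1, 2, 3, 4]
print(received is original)  # False
```

    One caveat: as far as I know, the pickling is done by a background thread shortly after put() returns, so it is safest not to modify an object right after putting it on the queue.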