I want to run several class instances in parallel and update their instance attributes via Redis. In each instance, a background thread listens to Redis for changes. When I run SET value 9 in Redis, the thread detects the change, but the self.value attribute is not updated.
import time
import threading

import redis


class Foo:
    def __init__(self, name: str):
        self.name = name
        self.redis_client = redis.Redis(host="localhost", port=6379)
        self.value = 5
        self.update_thread = threading.Thread(target=self.update, daemon=True)
        self.update_thread.start()

    def update(self):
        while True:
            new_value = int(self.redis_client.get("value"))
            if new_value != self.value:
                print("new value detected {}".format(new_value))
                self.value = new_value
            time.sleep(2)

    def run(self):
        while True:
            print("{}: Value: {}".format(self.name, self.value))
            time.sleep(2)


if __name__ == "__main__":
    import multiprocessing

    foo1 = Foo(name="foo1")
    foo2 = Foo(name="foo2")
    process1 = multiprocessing.Process(target=foo1.run)
    process2 = multiprocessing.Process(target=foo2.run)
    process1.start()
    process2.start()
console output:
foo1: Value: 5
foo2: Value: 5
new value detected 9
new value detected 9
foo1: Value: 5
foo2: Value: 5
While researching the subject, I came across the statement that "each process has its own memory space". But since I am not sharing data between processes in this case, I cannot understand why the data in the object instances is not preserved.
You are sharing data between processes, the main process and the subprocesses you created.
You create a thread in the main process when you instantiate Foo, e.g.:

    foo1 = Foo(name="foo1")
You then create a subprocess:

    process1 = multiprocessing.Process(target=foo1.run)
    ...
    process1.start()
What this does is essentially pickle foo1, send that object over the wire to the subprocess that was created, and in that subprocess the instance is unpickled and foo1.run is called. Now, in your main process, the thread that is running foo1.update is updating the instance in the main process, but the instance in the subprocess is not being updated, so it always reports the original value.
So try adding something like:
if __name__ == "__main__":
    import multiprocessing

    foo1 = Foo(name="foo1")
    foo2 = Foo(name="foo2")
    process1 = multiprocessing.Process(target=foo1.run)
    process2 = multiprocessing.Process(target=foo2.run)
    process1.start()
    process2.start()
    while True:
        print(f"In the main process, {foo1.value=}")
        time.sleep(10)
Perhaps it is important to understand what a method is.
>>> class Foo:
...     def bar(self):
...         print("hi from", self)
...
When accessed on the class, a method is just a function:
>>> Foo.bar
<function Foo.bar at 0x10120cd30>
>>> Foo.bar()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Foo.bar() missing 1 required positional argument: 'self'
But if you access it on an instance (through the magic of the descriptor protocol; see the Descriptor HowTo guide in the Python documentation, particularly the section on functions and methods), it returns a bound method object. That is basically a special callable object that passes the instance as the first positional argument (the name self is just a convention, you could call it banana if you want).
>>> foo = Foo()
>>> bound_method = foo.bar
>>> bound_method
<bound method Foo.bar of <__main__.Foo object at 0x10121beb0>>
>>> bound_method.__func__
<function Foo.bar at 0x10120cd30>
>>> bound_method.__self__
<__main__.Foo object at 0x10121beb0>
>>> bound_method()
hi from <__main__.Foo object at 0x10121beb0>
So the argument to multiprocessing.Process(target=...) will be pickled, and that argument is a bound method object, which is basically a reference to the function (the method!) plus the instance it is bound to.
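You can see both halves of that directly: pickling a bound method drags the instance along with it, and unpickling produces a copy of the instance, not the original (a quick illustrative sketch, not from the original post):

```python
import pickle


class Foo:
    def __init__(self, name):
        self.name = name

    def run(self):
        return self.name


foo = Foo("foo1")

# Pickling the bound method pickles the instance along with it.
data = pickle.dumps(foo.run)

# Mutate the original *after* pickling.
foo.name = "changed"

# Unpickling reconstructs a copy of the instance as it was at pickle time.
clone = pickle.loads(data)
print(clone())                 # foo1 -- the copy, not the mutated original
print(clone.__self__ is foo)   # False -- a distinct instance
```

This is exactly what happens to foo1 when the Process is started: the subprocess works on an unpickled copy, which the update thread in the main process can never touch.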