I have a gRPC project in which main.py spawns gRPC servers as subprocesses.
The project also contains a settings.py with some configuration, for example:

some_config = {"foo": "bar"}
In some files (used by different processes) I have:

import settings
...

and the value of settings.some_config is read. In the main process I have a listener that updates some_config
on demand, for example:

settings.some_config = new_value
I noticed that when I change the value of settings.some_config in the main process,
it does not change in the subprocess I checked; it keeps the old value.
I want all subprocesses to always have the most up-to-date value of settings.some_config.
A solution I thought about: passing a queue or a pipe to each subprocess, and when some_config
changes in the main process, sending the new data through the queue/pipe to each subprocess.
But how can I alert the subprocess to assign the new value to settings.some_config?
Should I use a listener in each subprocess, so that when a notification arrives it will do:

settings.some_config = new_value

Would this work? The end goal is to have the most up-to-date value of settings.some_config
across all modules/processes without restarting the server. I'm also not sure it would work,
since each module may keep the previously imported value of settings.some_config cached in its own memory.
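For reference, here is a minimal, self-contained repro of the behaviour I'm seeing (a types.SimpleNamespace stands in for the settings module, and the child reports what it sees through a queue; names are illustrative):

```python
from multiprocessing import Process, Queue
import time
import types

# Stand-in for `import settings`, so the repro is self-contained
settings = types.SimpleNamespace(some_config={"foo": "bar"})

child_view = None  # filled in below when run as a script

def child(q):
    time.sleep(1)                  # give the parent time to update the config
    q.put(settings.some_config)    # report what this process actually sees

if __name__ == "__main__":
    q = Queue()
    p = Process(target=child, args=(q,))
    p.start()
    settings.some_config = {"changed": "value"}  # visible only in the parent
    child_view = q.get()
    p.join()
    print("child saw:  ", child_view)            # still the old value
    print("parent sees:", settings.some_config)
```

The child either gets a copy of the parent's memory (fork) or re-imports the module (spawn), so the parent's later assignment is never visible to it.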
UPDATE
I adopted Charchit's solution and adjusted it to my requirements, so we have:
from multiprocessing.managers import BaseManager, NamespaceProxy
from multiprocessing import Process
import settings
import time

def get_settings():
    return settings

def run(proxy_settings):
    settings = proxy_settings  # so the module settings becomes the proxy object

if __name__ == '__main__':
    BaseManager.register('get_settings', get_settings, proxytype=NamespaceProxy)
    manager = BaseManager()
    manager.start()
    settings = manager.get_settings()
    p = Process(target=run, args=(settings,))
    p.start()
A few questions:

Should an entire module (settings) be the target of a proxy object? Is it standard to do so?
There is a lot of magic here. Is the simple explanation that the module settings is now a shared proxy object, so that when a subprocess reads settings.some_config, it actually reads the value from the manager?
Are there any side effects I should be aware of?
Should I be using locks when I change any value in settings in the main process?
The easiest way to do this is to share the module with a manager:
from multiprocessing.managers import BaseManager, NamespaceProxy
from multiprocessing import Process
import settings
import time

def get_settings():
    return settings

def run(settings):
    for _ in range(2):
        print("Inside subprocess, the value is", settings.some_config)
        time.sleep(3)

if __name__ == '__main__':
    BaseManager.register('get_settings', get_settings, proxytype=NamespaceProxy)
    manager = BaseManager()
    manager.start()
    settings = manager.get_settings()
    p = Process(target=run, args=(settings,))
    p.start()
    time.sleep(1)
    settings.some_config = {'changed': 'value'}
    p.join()
Doing so means you don't have to inform the subprocesses that the value has changed; they simply see the new value because they read it from the manager process, which handles this automatically.
Output
Inside subprocess, the value is {'foo': 'bar'}
Inside subprocess, the value is {'changed': 'value'}
Firstly, remember that settings.some_config needs to be set explicitly. This means you can do settings.some_config = {}
but you cannot do settings.some_config['foo'] = "bar".
If you want to modify a single key, get the latest config, update it, and explicitly set it back, like below:
temp = settings.some_config
temp['foo'] = 'bar'
settings.some_config = temp
Secondly, to keep the changes to your codebase to an absolute minimum, you are reassigning the settings
variable (initially bound to the settings.py module object) to the proxy. In the above code, you are doing this inside the __main__
block, so settings is changed globally in the main process. Therefore, any changes made to settings
from the main process are automatically reflected in the other processes accessing the proxy. This is also partially replicated inside the child process running the function run:
accessing settings from inside run means accessing the proxy, because run's parameter shadows the module. However, if you call some other function inside run
(say run2) which does not take settings
as an argument, then when it accesses settings it will access the imported module instead of the proxy. Example:
def run2():
    print("Inside subprocess run2, the value is", settings.some_config)

def run(settings):
    for _ in range(2):
        print("Inside subprocess run, the value is", settings.some_config)
        time.sleep(3)
    run2()
Output
Inside subprocess run, the value is {'foo': 'bar'}
Inside subprocess run, the value is {'changed': 'value'}
Inside subprocess run2, the value is {'foo': 'bar'}
If you do not want this, then you simply need to assign the argument to the global variable settings:
def run2():
    print("Inside subprocess run2, the value is", settings.some_config)

def run(shared_settings):
    global settings
    settings = shared_settings
    for _ in range(2):
        print("Inside subprocess run, the value is", settings.some_config)
        time.sleep(3)
    run2()
Any function inside the subprocess that accesses settings will now access the proxy.
Output
Inside subprocess run, the value is {'foo': 'bar'}
Inside subprocess run, the value is {'changed': 'value'}
Inside subprocess run2, the value is {'changed': 'value'}
Lastly, if you have many subprocesses running then this might become slow (more connections to the manager = less speed). If this bothers you, then I recommend doing it the way you stated in the question, i.e., passing a queue or a pipe to each subprocess. To make sure that a child process updates its value as fast as possible after you put the value into the queue, you can spawn a thread inside the subprocess which constantly polls the queue and, when a value arrives, updates the process's settings value to the one provided. Just make sure to run the thread as a daemon, or explicitly agree on an exit condition.
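A sketch of that queue-plus-daemon-thread approach (names are illustrative; a types.SimpleNamespace again stands in for the settings module, and in the real project the updater thread would assign to the imported settings module instead):

```python
import threading
import time
import types
from multiprocessing import Process, Queue

# Stand-in for `import settings`, so the sketch is self-contained
settings = types.SimpleNamespace(some_config={"foo": "bar"})

final_value = None  # filled in below when run as a script

def config_updater(q):
    # Daemon thread inside the subprocess: blocks on the queue and applies
    # each update to this process's settings as it arrives.
    while True:
        settings.some_config = q.get()

def run(cmd_q, result_q):
    threading.Thread(target=config_updater, args=(cmd_q,), daemon=True).start()
    time.sleep(2)                       # simulate work while updates arrive
    result_q.put(settings.some_config)  # report the value we ended up with

if __name__ == "__main__":
    cmd_q, result_q = Queue(), Queue()
    p = Process(target=run, args=(cmd_q, result_q))
    p.start()
    cmd_q.put({"changed": "value"})     # broadcast the new config
    final_value = result_q.get()
    p.join()
    print("subprocess ended with", final_value)
```

With many subprocesses you would keep one command queue per child and put the new value into each of them; the daemon thread dies with its process, so no explicit shutdown is needed.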
Update
Should an entire module (settings) be the target of a proxy object? Is it standard to do so?
If your question is whether it is safe to do so, then yes, it is; just keep in mind the things I have outlined in this answer. At the end of the day, a module is just another object, and sharing it here makes sense.
There is a lot of magic here, for instance, Is the simple answer, to how it works is that now the module settings is a shared proxy object? So when a sub process reads settings.some_config, it would actually read the value from manager?
You need to add a couple of lines to the run function for that to be the case; check the second point in the previous section.
Are there any side effects I should be aware of?
Check previous section.
Should I be using locks when I change any value in settings in the main process?
Not necessary here: only the main process writes to settings, and each individual proxy operation is handled serially by the manager process. You would only want a lock if several processes performed read-modify-write updates concurrently.
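For completeness, a sketch of the case where a lock does matter: several processes doing the read-modify-write pattern from earlier (get, mutate, set back). The names here are illustrative, and a plain Manager().Namespace() stands in for the shared settings module:

```python
from multiprocessing import Manager, Process

final_count = None  # filled in below when run as a script

def bump(ns, lock):
    # Read-modify-write on a shared attribute; without the lock, concurrent
    # updates from several processes could overwrite each other's increments.
    for _ in range(100):
        with lock:
            cfg = ns.some_config
            cfg["count"] += 1
            ns.some_config = cfg  # explicit reassignment, as noted above

if __name__ == "__main__":
    with Manager() as mgr:
        ns = mgr.Namespace()
        ns.some_config = {"count": 0}
        lock = mgr.Lock()  # a lock proxy shared through the same manager
        procs = [Process(target=bump, args=(ns, lock)) for _ in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        final_count = ns.some_config["count"]
        print("final count:", final_count)  # 400: no increments lost
```

Holding the lock across all three steps makes each update atomic; dropping the `with lock:` line would typically lose increments.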