Tags: django, database, orm, record, mode

How can I set a "safe" running "mode" that determines the collective behavior of django model superclasses I created?


Sorry for the poorly worded title. I just don't know how to phrase it succinctly.

UPDATE: Reframing my question

I decided to reframe my question because in responding to a comment, I realized that this question could be asked in a better and more succinct way, by stepping back and giving a more generic example.

I think this question is about how Django/Python handles memory under Apache (or nginx) and/or threading. I don't know enough about either topic to tell whether that is asking question A when the real question is B, which is why I hadn't phrased it that way.

I have a model superclass (that all my models derive from) that overrides various methods like Model.save(). In certain views, though, I want to change the behavior of all models' overridden save() method. I accomplished this by creating a global variable that sets the save "mode". E.g. in an admin edit view, auto_update is True, but in the load view, it is False.

I'm concerned that using a global variable is problematic in the case where 2 different users are using those 2 different views at the same time. Do they access the same global variable or not?

If they do, what mechanism can I use to change those modes in those 2 different views independently?

A pseudo-code example of the superclass's save override:

from django.db.models import Model

# Global variable
auto_update = True

def set_auto_update(val):
    global auto_update
    auto_update = val

class MaintainedModel(Model):
    def save(self, *args, **kwargs):
        if auto_update:
            # E.g. I am running under the admin edit view
            do_things_one_way()
        else:
            # I.e. I am running under the load view
            do_things_another_way()

A pseudo-code example of the load script run by the load view:

from maintained_model import set_auto_update

set_auto_update(False)
save_some_model_objects_from_a_file()
set_auto_update(True)
run_buffered_auto_updates_to_save_automated_model_changes()

A pseudo-code example of what happens in the admin edit view:

model_obj = process_form_data_from_edit_interface()
# Assuming global auto_update is at its default (True)
model_obj.save()

So in a nutshell, I want to change the save behavior of all the models at once depending on the view context. In the load view, for example, the behavior needs to change during execution: turn autoupdates off while it loads data and buffer all the autoupdates, then, at the end of the script, turn them back on and process the buffer.

If using global variables to do this could lead to problems, what's the "right way" to accomplish this?
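
To make the concern concrete, here is a minimal sketch outside Django. It assumes both views are served by threads inside one process; separate worker processes would each hold their own copy of the module global. The load_view and admin_edit_view functions are hypothetical stand-ins, not real views.

```python
import threading

# Module-level flag, mirroring the global in the question.
auto_update = True

def set_auto_update(val):
    global auto_update
    auto_update = val

def load_view(started, finish):
    # Hypothetical stand-in for the load view: auto-updates are off while it works.
    set_auto_update(False)
    started.set()   # signal that the flag has been flipped
    finish.wait()   # keep "loading" until the other view has run
    set_auto_update(True)

def admin_edit_view(seen):
    # Hypothetical stand-in for the admin edit view: record the flag it observes.
    seen.append(auto_update)

started, finish = threading.Event(), threading.Event()
seen = []

loader = threading.Thread(target=load_view, args=(started, finish))
loader.start()
started.wait()

editor = threading.Thread(target=admin_edit_view, args=(seen,))
editor.start()
editor.join()

finish.set()
loader.join()

print(seen)  # [False]: the edit "view" saw the loader's setting
```

Under that assumption, the two threads do read the same variable, so an edit that overlaps a load would run with auto-updates off.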

Original question with more details/background:

I think I can best ask my question by example. Here is the background:

I have a working Django model superclass I wrote called MaintainedModel. It provides a way to auto-update selected fields in a model class using decorated methods that the developer implements in the derived model class. The execution of those methods is controlled automatically via various overrides in my MaintainedModel superclass, such as an override of the Model.save() method. If a model object ever changes, the "maintained fields" in the derived model are auto-updated. It also provides a means to trigger those updates via changes to related model objects such that updates propagate via a relationship chain.

One problem I had to overcome is that the autoupdate methods can sometimes be complex and take some time, especially if there are multiple triggers of the same model object due to multiple triggering related model changes. In the wild, that extra time is not noticeable and you don't encounter this repeated trigger case, but during a load of a mass quantity of data, it can double or triple the load time.

I worked around this by creating a way to temporarily turn off "auto-updates" during the running of a load script, which causes the triggers of those updates to buffer. After the load completes, the load script can trigger all the unique autoupdates once with a significant time savings.
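
The buffering idea can be sketched roughly as follows. This is only an illustration under assumptions: UpdateBuffer, FakeRecord, and update_maintained_fields are hypothetical names, not the actual MaintainedModel implementation, and records are assumed to be identifiable by class name and primary key.

```python
class UpdateBuffer:
    """Hypothetical buffer: collects objects needing auto-updates, each once."""

    def __init__(self):
        self._pending = {}  # dict keeps insertion order and drops repeats

    def add(self, obj):
        # Key on (class name, primary key) so repeated triggers collapse to one.
        self._pending.setdefault((type(obj).__name__, obj.pk), obj)

    def flush(self):
        # Run each buffered update exactly once, then clear the buffer.
        pending, self._pending = self._pending, {}
        for obj in pending.values():
            obj.update_maintained_fields()
        return len(pending)


class FakeRecord:
    """Stand-in for a model instance; counts how often it is updated."""

    def __init__(self, pk):
        self.pk = pk
        self.updates = 0

    def update_maintained_fields(self):
        self.updates += 1


buffer = UpdateBuffer()
a, b = FakeRecord(1), FakeRecord(2)
for obj in (a, b, a, a, b):  # five triggers, two distinct records
    buffer.add(obj)

print(buffer.flush())        # 2
print(a.updates, b.updates)  # 1 1
```

Five triggers collapse into two updates, which is where the time savings during a mass load would come from.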

But while this works, I'm a little concerned that the way I did it could lead to problems under certain (yet to be encountered) circumstances. I used global variables (e.g. auto_update = True), and my understanding is that this is bad practice.

The reason I'm not concerned about it currently is that we currently do not have a record edit interface. All the data is loaded in one shot. But we do have a data submission validation interface, which turns off auto-updates, and we have plans for an admin record edit interface.

So I have 2 questions:

  1. If a record is edited in an admin edit interface during the time a user is validating a data submission (for which it turns off auto-update), could that cause the admin record edit to not trigger any auto-updates?
  2. If that's true, what is the "right way" to set the running mode so that both operations run in their respective modes (i.e. the admin record edit triggers an autoupdate and the data submission validation does not)?

I essentially need something that is specific to the context under which model objects are being created/edited/deleted. Model objects created during a load script or validation run should have auto-update off but model objects created/edited/deleted via any other means should have auto-update on.


Solution

  • The answer I arrived at solves the problem (it allows calls to save in separate threads to behave differently) and keeps encapsulation, so that I don't have to pass arguments to every save call everywhere. I did this by implementing a context manager.

    I created a class called MaintainedModelCoordinator. It defines the operating mode of the save calls, and I gave it 3 modes: immediate, deferred, and disabled. For any method under which I want all save calls to behave a certain way, I instantiate a new coordinator and pass it to a with statement. I also implemented this as a decorator to make it dead simple. So for example, if I have an upload validation page where I do not want to perform any autoupdates, I just do the following:

    E.g. validation_view.py:

    @MaintainedModel.no_autoupdates()
    def validate_data():
        ...
    

    At its heart, all the decorator does is the following (e.g. inside the decorator's wrapper method):

    disabled_coordinator = MaintainedModelCoordinator(mode="disabled")
    with MaintainedModel.custom_coordinator(disabled_coordinator):
        validate_data()
    

    And the custom_coordinator method is just:

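    @classmethod  # needed because the method receives cls
    @contextmanager
    def custom_coordinator(cls, coordinator):
        do_pre_run_things()
        coordinator_stack.append(coordinator)
        try:
            yield
            # Only reached if the with block finished without raising
            do_post_run_things()
        finally:
            # Always restore the previous coordinator
            coordinator_stack.pop()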
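
For readers who want to see how the decorator and the context manager might fit together, here is a self-contained sketch that runs without Django. Coordinator and Maintained are hypothetical stand-ins for MaintainedModelCoordinator and MaintainedModel, the pre-run and post-run hooks are omitted, and the stack is a plain shared list for brevity, so this version is not thread-safe.

```python
from contextlib import contextmanager
from functools import wraps


class Coordinator:
    """Hypothetical stand-in for MaintainedModelCoordinator."""

    def __init__(self, mode="immediate"):
        self.mode = mode


class Maintained:
    """Hypothetical stand-in for MaintainedModel (no Django required)."""

    # Single shared stack for brevity; not thread-safe as written.
    coordinator_stack = [Coordinator("immediate")]

    @classmethod
    def get_coordinator(cls):
        # The innermost active context wins.
        return cls.coordinator_stack[-1]

    @classmethod
    @contextmanager
    def custom_coordinator(cls, coordinator):
        cls.coordinator_stack.append(coordinator)
        try:
            yield coordinator
        finally:
            # Restore the previous coordinator even if the body raised.
            cls.coordinator_stack.pop()

    @classmethod
    def no_autoupdates(cls):
        # Decorator factory: run the wrapped function under a disabled coordinator.
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                with cls.custom_coordinator(Coordinator("disabled")):
                    return func(*args, **kwargs)
            return wrapper
        return decorator


@Maintained.no_autoupdates()
def validate_data():
    return Maintained.get_coordinator().mode


print(validate_data())                    # disabled
print(Maintained.get_coordinator().mode)  # immediate
```

The mode is "disabled" only while validate_data() runs, and the default is back in place as soon as it returns.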
    

    Then, in my save() override, all I have to do to determine the mode is grab the current coordinator off the coordinator_stack:

    def save(self, *args, **kwargs):
        coordinator = self.get_coordinator()
        if coordinator.mode == "disabled":
            # E.g. I am running under the load or validation view
            do_things_another_way()
        else:
            # E.g. I am running under the admin edit view
            do_things_one_way()
    

    The only trick to keeping threads from stepping on one another is deciding where to store the coordinator_stack variable. You don't want multiple threads pushing onto and popping off the same stack. There may be a better way to do it than what I did, but what has worked for me is making it a class attribute of MaintainedModel that is a threading.local variable, which I initialize for every thread as needed.
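
A minimal sketch of the per-thread stack, assuming lazy initialization on first access. Maintained, push_mode, and current_mode are hypothetical names, and modes are plain strings here rather than coordinator objects.

```python
import threading


class Maintained:
    """Hypothetical stand-in for MaintainedModel (no Django required)."""

    # Attributes set on a threading.local are visible only to the setting thread.
    _local = threading.local()

    @classmethod
    def _stack(cls):
        # Lazily create the stack the first time a given thread asks for it.
        if not hasattr(cls._local, "coordinator_stack"):
            cls._local.coordinator_stack = ["immediate"]  # default mode
        return cls._local.coordinator_stack

    @classmethod
    def push_mode(cls, mode):
        cls._stack().append(mode)

    @classmethod
    def current_mode(cls):
        return cls._stack()[-1]


results = {}
pushed = threading.Event()
checked = threading.Event()

def validation_thread():
    Maintained.push_mode("disabled")
    pushed.set()    # the other thread may now look at its own stack
    checked.wait()  # stay in "disabled" mode until it has looked
    results["validation"] = Maintained.current_mode()

def edit_thread():
    pushed.wait()
    results["edit"] = Maintained.current_mode()
    checked.set()

threads = [
    threading.Thread(target=validation_thread),
    threading.Thread(target=edit_thread),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results["validation"], results["edit"])  # disabled immediate
```

Even while the validation thread is in "disabled" mode, the edit thread still sees the default, because each thread gets its own stack.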

    There are more details involved, like the exception handling and coordinator mode precedence that I implemented, but all of that is ancillary to the question.

    This keeps the code encapsulated, because I don't have to edit the arguments of every save() call; behavior is changed only through decorators, and the change applies to every save() call made under a given context. Contexts can even be nested, which avoids the state issues inherent in using global variables.