django asynchronous model save overriding

How to modify a Django Model field on `.save()` whose value depends on the incoming changes?


I have fields in multiple related models whose values are fully derived from other fields, both in the model being saved and in related models. I wanted to automate their maintenance so that they are always current and valid, so I wrote a base class that each model inherits from. It overrides .save() and .delete().

It pretty much works except for when multiple updates are triggered via changes to a through model of a M:M relationship between models named Infusate and Tracer (the through model is named InfusateTracer). So for example, I have a test that creates 2 InfusateTracer model records, which triggers updates to Infusate:

glu_t = Tracer.objects.create(compound=glu)
c16_t = Tracer.objects.create(compound=c16)
io = Infusate.objects.create(short_name="ti")
InfusateTracer.objects.create(infusate=io, tracer=glu_t, concentration=1.0)
InfusateTracer.objects.create(infusate=io, tracer=c16_t, concentration=2.0)

print(f"Name: {infusate.name}")
Infusate.objects.get(name="ti{C16:0-[5,6-13C5,17O1];glucose-[2,3-13C5,4-17O1]}")

The save() override looks like this:

    def save(self, *args, **kwargs):
        # Set the changed value triggering this update so that the derived value of the automatically updated field reflects the new values:
        super().save(*args, **kwargs)
        # Update the fields that change due to the above change (if any)
        self.update_decorated_fields()

        # Note, I cannot call save again because I get a duplicate exception, so `update_decorated_fields` uses `setattr`:
        # super().save(*args, **kwargs)

        # Percolate changes up to the parents (if any)
        self.call_parent_updaters()

The automatically maintained fields are updated here. Note that the fields to update, the functions that generate their values, and the links to the parents are all kept in a global returned by get_my_updaters(). Its entries come from a decorator I wrote, which is applied to the updating functions:

    def update_decorated_fields(self):
        for updater_dict in self.get_my_updaters():
            update_fun = getattr(self, updater_dict["function"])
            update_fld = updater_dict["update_field"]
            if update_fld is not None:
                current_val = None
                # ... brevity edit
                new_val = update_fun()
                setattr(self, update_fld, new_val)
                print(f"Auto-updated {self.__class__.__name__}.{update_fld} using {update_fun.__qualname__} from [{current_val}] to [{new_val}]")

In the test code at the top of this post, where the InfusateTracer linking records are created, this is the method responsible for the updates that are not fully happening:

    def call_parent_updaters(self):
        parents = []
        for updater_dict in self.get_my_updaters():
            update_fun = getattr(self, updater_dict["function"])
            parent_fld = updater_dict["parent_field"]
            # ... brevity edit
            if parent_inst is not None and parent_inst not in parents:
                parents.append(parent_inst)

        for parent_inst in parents:
            if isinstance(parent_inst, MaintainedModel):
                parent_inst.save()
            elif parent_inst.__class__.__name__ == "ManyRelatedManager":
                if parent_inst.count() > 0 and isinstance(
                    parent_inst.first(), MaintainedModel
                ):
                    for mm_parent_inst in parent_inst.all():
                        mm_parent_inst.save()

And here's the relevant ordered debug output:

Auto-updated Infusate.name using Infusate._name from [ti] to [ti{glucose-[2,3-13C5,4-17O1]}]
Auto-updated Infusate.name using Infusate._name from [ti{glucose-[2,3-13C5,4-17O1]}] to [ti{C16:0-[5,6-13C5,17O1];glucose-[2,3-13C5,4-17O1]}]
Name: ti{glucose-[2,3-13C5,4-17O1]}
DataRepo.models.infusate.Infusate.DoesNotExist: Infusate matching query does not exist.

Note that the output Name: ti{glucose-[2,3-13C5,4-17O1]} is incomplete (even though the debug output above it is complete: ti{C16:0-[5,6-13C5,17O1];glucose-[2,3-13C5,4-17O1]}). It contains the information resulting from the creation of the first through record:

InfusateTracer.objects.create(infusate=io, tracer=glu_t, concentration=1.0)

But the subsequent through record created by:

InfusateTracer.objects.create(infusate=io, tracer=c16_t, concentration=2.0)

...is not reflected in the final value of the Infusate record's name field, even though all the Auto-updated debug output is correct and is what I expected to see. The name should be composed of values gathered from 7 different records, as displayed in the last Auto-updated debug line: 1 Infusate record, 2 Tracer records, and 4 TracerLabel records.

Is this due to asynchronous execution or is this because I should be using something other than setattr to save the changes? I've tested this many times and the result is always the same.

Incidentally, I lobbied our team not to have these automatically maintained fields at all, because of their potential to become invalid from DB changes. The lab people apparently like having them, though, because that's how the suppliers name the compounds, and they want to be able to copy/paste them in searches, etc.


Solution

  • The problem here is a misconception over how changes are applied, when they are used in the construction of the new derived field value, and when the super().save method should be called.

    Here, I am creating a record:

    io = Infusate.objects.create(short_name="ti")
    

    That is related to these 2 records (also being created):

    glu_t = Tracer.objects.create(compound=glu)
    c16_t = Tracer.objects.create(compound=c16)
    

    Then, those records are linked together in a through model:

    InfusateTracer.objects.create(infusate=io, tracer=glu_t, concentration=1.0)
    InfusateTracer.objects.create(infusate=io, tracer=c16_t, concentration=2.0)
    

    I had thought (incorrectly) that I had to call super().save() first, so that when the field values are gathered to compose the name field, the incoming changes would be included in the name.

    However, the values are read from the self object, so it doesn't matter that they haven't been saved yet. What does matter is that setattr only changes the in-memory instance: a derived value set after super().save() is not written to the database by that call.
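
The ordering effect can be illustrated without Django. The sketch below is a toy under stated assumptions: the `FAKE_DB` dict stands in for the database table, `persist()` plays the role of super().save(), and the `Record` class and its method names are hypothetical. It only shows that an attribute set after the write is not part of that write, which appears consistent with the one-step lag in the question's output.

```python
# Toy stand-in for a database table: primary key -> persisted row.
FAKE_DB = {}


class Record:
    def __init__(self, pk, short_name):
        self.pk = pk
        self.short_name = short_name
        self.name = None  # derived field

    def derive_name(self):
        # Placeholder derivation; the real one gathers values from related records.
        return f"{self.short_name}{{derived}}"

    def persist(self):
        # Plays the role of super().save(): writes the current attributes.
        FAKE_DB[self.pk] = {"short_name": self.short_name, "name": self.name}

    def save_then_set(self):
        # Order used in the question: write first, then setattr.
        self.persist()
        setattr(self, "name", self.derive_name())

    def set_then_save(self):
        # Order used in the answer: setattr first, then write.
        setattr(self, "name", self.derive_name())
        self.persist()


a = Record(1, "ti")
a.save_then_set()
print(a.name, FAKE_DB[1]["name"])  # ti{derived} None

b = Record(2, "ti")
b.set_then_save()
print(b.name, FAKE_DB[2]["name"])  # ti{derived} ti{derived}
```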

    At this point, it's useful to fill in some of the gaps in the code supplied in the question. This is a portion of the Infusate model:

    class Infusate(MaintainedModel):
    
        id = models.AutoField(primary_key=True)
        name = models.CharField(...)
        short_name = models.CharField(...)
        tracers = models.ManyToManyField(
            Tracer,
            through="InfusateTracer",
        )
    
        @field_updater_function(generation=0, update_field_name="name")
        def _name(self):
            if self.tracers is None or self.tracers.count() == 0:
                return self.short_name
            return (
                self.short_name
                + "{"
                + ";".join(sorted(map(lambda o: o._name(), self.tracers.all())))
                + "}"
            )
    

    And this is the error that I had (incorrectly) taken to mean that the record had to be saved before I could access the values:

    ValueError: "<Infusate: >" needs to have a value for field "id" before this many-to-many relationship can be used.
    

    when I had tried the following version of my save override:

        def save(self, *args, **kwargs):
            self.update_decorated_fields()
            super().save(*args, **kwargs)
            self.call_parent_updaters()
    

    But what this really meant was that I had to test something other than self.tracers is None to see whether any M:M links exist. We can simply check self.id. If it's None, the record has not been saved yet, so we can infer that it has no M:M links. So the answer to this question is simply to change the save method override to:

        def save(self, *args, **kwargs):
            self.update_decorated_fields()
            super().save(*args, **kwargs)
            self.call_parent_updaters()
    

    and edit the method that generates the value for the field update to:

        @field_updater_function(generation=0, update_field_name="name")
        def _name(self):
            if self.id is None or self.tracers is None or self.tracers.count() == 0:
                return self.short_name
            return (
                self.short_name
                + "{"
                + ";".join(sorted(map(lambda o: o._name(), self.tracers.all())))
                + "}"
            )