Search code examples
djangodjango-modelsdjango-migrations

In a django custom field, is it invalid to give the constructor a class-type input parameter?


I'm trying to define a custom field in Django, and I wanted to initialize it with a helper object. The field class receives the helper object an init parameter, and stores it internally. Something that would work like this:

# define helper class
class Foo:
   # ...

# instance of helper class
myFoo = Foo()

# model with field accepting helper class
class MyModel(models.Model):
    foo = FooManagedField(foo=myFoo)

This seems to me like a pretty reasonable thing to do.
But in practice, it completely breaks migration tooling.

What I'm seeing happen is this: As soon as any model makes use of this field, Django considers it to always be in an altered state (I assume because it doesn't consider different objects equivalent). Therefore, it registers as always have changes requiring new migrations.

Is there any way to solve this, any way to indicate that the definition is consistent? Or is it a blanket restriction that a custom field can only be initialized with primitives?
Or maybe there's a reason I'm not seeing that passing object instances into a custom Field implementation is a bad idea and I shouldn't want to do it?

Walkthrough and Detail

I'm defining a simple custom Django field, as follows:

class FooManagedField(CharField):
    def __init__(self, foo : Foo, *args, **kwargs) -> None:
        self.foo : Foo = foo
        super().__init__(*args, **kwargs)
    
    def deconstruct(self):
        name, path, args, kwargs = super().deconstruct()
        kwargs["foo"]=self.foo
        return name, path, args, kwargs

Where Foo is a simple object, e.g.:

class Foo:
    def __init__(self, n):
        self.n=n
    def __repr__(self):
        return f"Foo({self.n})"

Now I add this new field type to a model (having written supporting class FooSerializer to support migrations -- supporting code below):

from django.db import models
from .utils.foo_field import Foo, FooManagedField

fooA = Foo(1)
fooB = Foo(2)

class MyModel(models.Model):
    n1 = models.IntegerField() # preexisting field
    n2 = models.IntegerField() # preexisting field
    foo1 = FooManagedField(foo=fooA) # new FooManagedField
    foo2 = FooManagedField(foo=fooB) # new FooManagedField

Running makemigrations produces a migration file, as expected:

    operations = [
        migrations.AddField(
            model_name='mymodel',
            name='foo1',
            field=myproject.utils.foo_field.FooManagedField(foo=Foo(1)),
        ),
        migrations.AddField(
            model_name='mymodel',
            name='foo2',
            field=myproject.utils.foo_field.FooManagedField(foo=Foo(2)),
        ),
    ]

But then, running makemigrations again will produce another migration file. Running it again will produce a third. With this field definition, there is no final state. All the new migrations will look like this:

    operations = [
        migrations.AlterField(
            model_name='mymodel',
            name='foo1',
            field=myproject.utils.foo_field.FooManagedField(foo=Foo(1)),
        ),
        migrations.AlterField(
            model_name='mymodel',
            name='foo2',
            field=myproject.utils.foo_field.FooManagedField(foo=Foo(2)),
        ),
    ]

Question

Is this a hard limitation in Django? Is there a way to provide a class, like I'd like to?

And, is the fact that I want this construction to begin with, a sign I'm making mistakes in my code design? Am I coming at this in the wrong way -- does it not make sense to encapsulate some reusable logic into a helper class that defines the field?

References

  • Infinite AlterField Migrations due to default callable object mismatch is an existing ticket which helped me figure out what the immediate issue was (though I'm still missing the rationale and a workaround, or else to understand why this shouldn't be worked around)
  • non_db_attrs is a relatively new property which lets you mark "attributes of a field that don’t affect a column definition". Is this helpful in any way to this situation? I tried adding this and saw no improvement, but maybe I misunderstood usage.

Supporting code

Foo serializer (reference here):

from django.db.migrations.serializer import BaseSerializer
from django.db.migrations.writer import MigrationWriter

class FooSerializer(BaseSerializer):
    def serialize(self):
        return repr(self.value), {"from myproject.utils.foo_field import Foo"}


MigrationWriter.register_serializer(Foo, FooSerializer)

Solution

  • TL;DR

    Use this code snippet to make it work -

    class FooManagedField(models.CharField):
        def __init__(self, foo: Foo, *args, **kwargs) -> None:
            self.foo: Foo = foo
            super().__init__(*args, **kwargs)
    
        def deconstruct(self):
            name, path, args, kwargs = super().deconstruct()
            kwargs["foo"] = "Foo"  # set a static value
            return name, path, args, kwargs


    Long Answer

    From the doc

    The counterpoint to writing your __init__() method is writing the deconstruct() method. It’s used during model migrations to tell Django how to take an instance of your new field and reduce it to a serialized form - in particular, what arguments to pass to __init__() to recreate it.

    If you haven’t added any extra options on top of the field you inherited from, then there’s no need to write a new deconstruct() method. If, however, you’re changing the arguments passed in __init__() (like we are in HandField), you’ll need to supplement the values being passed.

    Simply put, deconstruct() determines whether Django needs to generate a new migration file whenever the makemigrations command is called.

    Here, the foo argument doesn't take part in the DB migration, and hence, we don't need to do anything to the deconstruct(...) method. In other words, the following example is enough

    class FooManagedField(models.CharField):
    
        def __init__(self, foo: Foo, *args, **kwargs) -> None:
            self.foo: Foo = foo
            super().__init__(*args, **kwargs)
    
    

    But, we will get the following error,

    TypeError: Couldn't reconstruct field foo1 on polls.MyModel: init() missing 1 required positional argument: 'foo'

    This is because we defined the foo argument as mandatory/required; and thus we need to supply some value for the same from the deconstruct(...)

    Now, we need to override the deconstruct(...) method by providing value for foo to generate migration file -

    class FooManagedField(models.CharField):
    
        def __init__(self, foo: Foo, *args, **kwargs) -> None:
            self.foo: Foo = foo
            super().__init__(*args, **kwargs)
    
        def deconstruct(self):
            name, path, args, kwargs = super().deconstruct()
            kwargs["foo"] = self.foo # Note that `self.foo` is a class object
            return name, path, args, kwargs

    Unfortunately, this will again raise an error -

    ValueError: Cannot serialize: Foo(1)

    and this is because the Foo object is a custom class and Django doesn't know how to serialize them.

    Setting the value with a built-in type (int, str, float, etc) will solve the issue.

    class FooManagedField(models.CharField):
    
        def __init__(self, foo: Foo, *args, **kwargs) -> None:
            self.foo: Foo = foo
            super().__init__(*args, **kwargs)
    
        def deconstruct(self):
            name, path, args, kwargs = super().deconstruct()
            kwargs["foo"] = str(self.foo)
            return name, path, args, kwargs

    But, this also has some issues, which is whenever the value of self.foo changes, a new migration file will be generated (not automatically, but upon the makemigrations command).

    So, we are setting the value to a static value inside deconstruct(...) method -

    class FooManagedField(models.CharField):
        def __init__(self, foo: Foo, *args, **kwargs) -> None:
            self.foo: Foo = foo
            super().__init__(*args, **kwargs)
    
        def deconstruct(self):
            name, path, args, kwargs = super().deconstruct()
            kwargs["foo"] = "Foo"  # set a static value
            return name, path, args, kwargs


    Note: As per the documentation, the non_db_attrs attribute should work, but it doesn't. I'll make some tests and raise a ticket if required.