python, django, django-models, model, foreign-keys

Better way to access a nested foreign key field in Django


Consider the following models

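from django.db import models

# Field types are placeholders; the question does not specify them.
# Django adds the id primary key automatically.
class A(models.Model):
    field1 = models.IntegerField()
    field2 = models.IntegerField()

class B(models.Model):
    field3 = models.IntegerField()
    field_a = models.ForeignKey(A, on_delete=models.CASCADE)

class C(models.Model):
    field4 = models.IntegerField()
    field5 = models.IntegerField()
    field_b = models.ForeignKey(B, on_delete=models.CASCADE)

    @property
    def nested_field(self):
        # Traverses two foreign keys: C -> B -> A
        return self.field_b.field_a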

The nested_field property on C triggers additional SQL queries whenever it is accessed on an instance whose related rows have not been loaded yet. Is there a more efficient way to get nested foreign key fields?

I have searched for an answer but couldn't find a solution that addresses this problem.


Solution

  • Turning my comments into an answer: select_related() is one of the go-to tools. As you noticed, the model instance needs to have been fetched by a queryset that has made the appropriate call to select_related().

    queryset = C.objects.select_related('field_b__field_a')
    obj = queryset.first()
    print(obj.nested_field)  # Shouldn't cost additional queries
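
To make the cost of the lazy property concrete, here is a minimal sketch that mimics the two access patterns with the standard library's sqlite3 module rather than Django itself. The tables mirror models A, B and C from the question; the helper names (run, lazy_traversal, joined_traversal) and the sample rows are invented for illustration, and the SQL is only an approximation of what the ORM would send.

```python
import sqlite3

# Hypothetical tables mirroring models A, B and C; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, field3 TEXT,
                    field_a_id INTEGER REFERENCES a(id));
    CREATE TABLE c (id INTEGER PRIMARY KEY, field4 TEXT, field5 TEXT,
                    field_b_id INTEGER REFERENCES b(id));
""")
for i in range(1, 4):
    conn.execute("INSERT INTO a VALUES (?, ?, ?)", (i, f"a{i}", "x"))
    conn.execute("INSERT INTO b VALUES (?, ?, ?)", (i, f"b{i}", i))
    conn.execute("INSERT INTO c VALUES (?, ?, ?, ?)", (i, f"c{i}", "y", i))

executed = []  # every SQL string sent through run()


def run(sql, params=()):
    """Execute one statement, record it, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def lazy_traversal():
    """Mimic the property: one query for C, then two more per C row."""
    values = []
    for (field_b_id,) in run("SELECT field_b_id FROM c ORDER BY id"):
        (field_a_id,) = run("SELECT field_a_id FROM b WHERE id = ?", (field_b_id,))[0]
        (field1,) = run("SELECT field1 FROM a WHERE id = ?", (field_a_id,))[0]
        values.append(field1)
    return values


def joined_traversal():
    """Mimic select_related(): a single query that joins C -> B -> A."""
    rows = run(
        "SELECT a.field1 FROM c "
        "JOIN b ON c.field_b_id = b.id "
        "JOIN a ON b.field_a_id = a.id "
        "ORDER BY c.id"
    )
    return [row[0] for row in rows]


executed.clear()
lazy_values = lazy_traversal()
lazy_query_count = len(executed)

executed.clear()
joined_values = joined_traversal()
joined_query_count = len(executed)

print(lazy_query_count, joined_query_count)  # 7 1 (lazy is 1 + 2 per C row)
```

Both functions return the same values, but the lazy version grows by two queries for every C row, which is how a harmless-looking property turns into thousands of queries inside a loop or template.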
    

    The main reason I turned this into an answer is to mention a tradeoff: select_related() does the equivalent of SELECT * on the related model(s). It's convenient when you want to have the related model instance available, and not much of a problem for most small models; however, if you have a related model with a lot of columns, or several distantly nested models, this can cause unnecessary overhead if you don't need to use all the fields that are being retrieved.

    You can optimize this using only(), but what I've used more often is annotate(). I feel like annotate() gets overlooked for these kinds of common foreign key traversals, but it can give you a similar interface to using model properties (and it took me way too long into my coding career to figure that out):

    from django.db.models import F
    
    queryset = C.objects.annotate(nested_field_1=F('field_b__field_a__field1'))
    obj = queryset.first()
    print(obj.nested_field_1)  # similar, though it's per-field
    

    annotate() does the equivalent of SELECT ... AS nested_field_1 here, and it can accomplish many of the same things as a model property. nested_field_1 is then available as an attribute on each object in the query result.
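
As a rough illustration of the difference in what gets fetched, the sketch below runs two hand-written queries with the standard library's sqlite3 module. They only approximate the SQL Django would generate for select_related('field_b__field_a') and for annotate(nested_field_1=F('field_b__field_a__field1')); the table layout, column types and sample row are assumptions made for the example.

```python
import sqlite3

# Hypothetical tables mirroring models A, B and C; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, field3 TEXT,
                    field_a_id INTEGER REFERENCES a(id));
    CREATE TABLE c (id INTEGER PRIMARY KEY, field4 TEXT, field5 TEXT,
                    field_b_id INTEGER REFERENCES b(id));
    INSERT INTO a VALUES (1, 'a1', 'a2');
    INSERT INTO b VALUES (1, 'b3', 1);
    INSERT INTO c VALUES (1, 'c4', 'c5', 1);
""")

JOINS = "FROM c JOIN b ON c.field_b_id = b.id JOIN a ON b.field_a_id = a.id"

# Approximates select_related(): every column of C, B and A comes back.
wide = conn.execute(f"SELECT c.*, b.*, a.* {JOINS}").fetchone()

# Approximates annotate(): C's own columns plus one aliased column from A.
narrow = conn.execute(
    f"SELECT c.*, a.field1 AS nested_field_1 {JOINS}"
).fetchone()

print(len(wide), len(narrow))    # 10 5
print(narrow["nested_field_1"])  # a1
```

The joins are the same in both queries; what changes is how many columns travel back and get turned into Python objects, which matters more as the related tables get wider.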

    To make this more reusable, these kinds of calls to annotate() can be added to a custom Manager:

    from django.db.models import F, Manager, Model
    
    class ModelCManager(Manager):
        def get_queryset(self):
            return (
                super().get_queryset()
                .annotate(nested_field_1=F('field_b__field_a__field1'))
            )
    
    class C(Model):
        ...
        objects = Manager()
        with_nested = ModelCManager()
    
    
    queryset = C.with_nested.all()
    obj = queryset.first()
    print(obj.nested_field_1)
    

    In a production application where I had a LOT of normalized, nested foreign keys, I would more frequently extend QuerySet so that I could chain additional methods together:

    from django.db.models import F, Model, QuerySet
    
    class ModelCQuerySet(QuerySet):
        def annotate_a_fields(self):
            return self.annotate(
                a_field_1=F('field_b__field_a__field1'),
                a_field_2=F('field_b__field_a__field2')
            )
        
        def annotate_b_fields(self):
            return self.annotate(
                b_field_3=F('field_b__field3')
            )
    
    class C(Model):
        ...
        objects = ModelCQuerySet.as_manager()
    
    
    queryset = (
        C.objects
        .filter(field_b__field3=42)
        .annotate_a_fields()
        .annotate_b_fields()
    )
    obj = queryset.first()
    print(obj.a_field_1)
    

    With the above, you have a lot of control over the interface you create, and the queries needed to get the data you want are obvious. Model properties are still super useful for local column operations, like joining strings or formatting values, but I've been burned enough times by surprise 9000+ implicit queries that I avoid defining any model properties that have to traverse a foreign key. Moving those data retrieval concerns to the QuerySet helped guard against some of those accidents.