Search code examples
sqldjangodjango-modelsormdjango-queryset

Using Django ORM to query a Model with two FKs that are related through an unique_together constraint on a third Model


I am currently facing a challenge in creating a QuerySet that accurately represents the relationships within my Django models. I have a model equipped with two foreign keys, FK1 and FK2. These two foreign keys, when combined, create a tuple representing a classification (FK1, FK2 -> CLS.pk). This relationship is managed and represented through another model, where I store the records corresponding to these foreign keys and the resulting classification (FK1, FK2, CLS.pk).

Defining my question has been a challenge in itself, so I've tried to simplify the relationship to illustrate the core idea of what I'm attempting to achieve here.

class Event(models.Model):
    value = models.DecimalField(decimal_places=2, max_digits=11, default=0.0)
    event_date = models.DateField()
    local = models.ForeignKey(Local, on_delete=models.CASCADE, related_name='event_set')
    origin = models.ForeignKey(Origin, on_delete=models.CASCADE, related_name='event_set')

    class Meta:
        db_table = 'db_event'
        
class Local(models.Model):
    name = models.CharField(max_length=100, unique=True)

    class Meta:
        db_table = 'db_local'
        
class Origin(models.Model):
    name = models.CharField(max_length=100, unique=True)

    class Meta:
        db_table = 'db_origin'
        
class Classification(models.Model):
    name = models.CharField(max_length=100, unique=True)

    class Meta:
        db_table = 'db_classification'
        
class ClassificationPerOriginAndLocal(models.Model):
    local = models.ForeignKey(Local, on_delete=models.CASCADE, related_name='classification_related_set')
    origin = models.ForeignKey(Origin, on_delete=models.CASCADE, related_name='classification_related_set')
    classification = models.ForeignKey(Classification, on_delete=models.CASCADE, related_name='classification_related_set')
    
    class Meta:
        db_table = 'db_classification_per_origin_and_local'
        unique_together = ('local', 'origin',)

Consider an 'Event' Model designed to store records for events occurring at a specific origin and in a specific local. Later on I need to classify each event and this classification is achieved through the tuple 'origin' and 'local'.

This classification system is managed through a third model called 'ClassificationPerOriginAndLocal', where I specify which 'Classification' object corresponds to a given pair of 'origin' and 'local' objects.

In a SQL raw query, I would formulate something like this to obtain the 'Classification.pk':

SELECT
        ev.*,
        clsf.classification_id
    FROM db_event as ev 
    INNER JOIN db_classification_per_origin_and_local as clsf
     ON clsf.local_id = ev.local_id
      AND clsf.origin_id = ev.origin_id

How can I use (if it's possible) the Django ORM to create this query?


Solution

  • After a long time digging through other Stack Overflow threads and the Django documentation, I managed to resolve this issue by using Subquery and OuterRef:

    from django.db.models import OuterRef, Subquery
    
    # Creating the subquery
    subquery = (
        ClassificationPerOriginAndLocal
        .objects
        .filter(local=OuterRef('local'), origin=OuterRef('origin'))
    )
    
    # Constructing the final query with annotation
    final_query = (
        Event
        .objects
        .annotate(classification=Subquery(subquery.values('classification')))
    )
    

    For context, the solution I found ensures that the subquery only returns a singular value per relation (local + origin -> classification). This is enforced by leveraging the unique constraint for local + origin (set using unique_together).

    Moreover, if you need to access specific fields within the classification object, you can achieve this using Django's '__' lookups:

    final_query = (
        Event
        .objects
        .annotate(
            classification=Subquery(subquery.values('classification')),
            classification__name=Subquery(subquery.values('classification__name')
        )
    )