Search code examples
djangofilterormjoinprefetch

How to limit records in tables with n to n relation before joining them , to avoid duplicates in django ORM?


Extending my previous question on stack-overflow. I have four tables:

A <--- Relation ---> B ---> Category

(So the relation between A and B is n to n, where the relation between B and Category is n to 1) Relation stores 'Intensity' of A in B. I need to calculate the intensity of A in each Category and find the Maximum result. It is achievable using:

A.objects.values(
    'id', 'Relation_set__B__Category_id'
).annotate(
     AcIntensity=Sum(F('Relation_set__Intensity'))
).aggregate(
     Max(F('AcIntensity'))
)['AcIntensity__max']

Now I need to filter the intensities based on some fields in B beforhand:

A.objects.values(
    'id', 'Relation_set__B__Category_id'
).filter(
    Relation_set__B__BType=type_filter
).annotate(
    AcIntensity=Sum(F('Relation_set__Intensity'))
).aggregate(
    Max(F('AcIntensity'))
)['AcIntensity__max']

However I need to avoid duplication resulted due to table join which messes the calculation up.(beside those field to define filtration, I do not need any fields in B)
Is there a way to achieve this using Django ORM?

Update
I think what I need is to limit the records in Relation table (based on B filters) before querying the database. How can I do that?
(Maybe using Prefetch_related and Prefetch?)


Solution

  • Finally I've done it using conditional aggregation.
    You could find more details in this stack-overflow post. So the final code would be:

    result = A.objects.values(
        'id', 'Relation_set__B__Category_id'
    ).annotate(
        AcIntensity=Sum(
            Case(
                When(
                   q_statement_for_filter,then=F('Relation_set__Intensity'),
                ),
                default=0,
                output_field=FloatField()
            )
        )
    ).exclude(
        AcIntensity=0
    ).aggregate(
        Max(F('AcIntensity'))
    )['AcIntensity__max']
    

    Notice that 'q_statement_for_filter' cannot be an empty Q().