Search code examples
djangodjango-modelsgroup-bydjango-orm

Django: Group by and return multiple fields


In Django, Consider the following model class,

class Report(models.Model):
    owner = models.ForeignKey(
        to=Owner,
        on_delete=models.CASCADE,
        related_name='data',
    )
    balance = models.PositiveIntegerField()
    report_date = models.DateField()

Assume this table has only the following four items,

<QuerySet [{'owner': 1, 'balance': 100, 'report_date': datetime.date(2023, 3, 4)}, {'owner': 1, 'balance': 50, 'report_date': datetime.date(2023, 3, 9)}, {'owner': 2, 'balance': 1000, 'report_date': datetime.date(2023, 2, 2)}, {'owner': 2, 'balance': 2000, 'report_date': datetime.date(2023, 2, 22)}]>

a simple group by owner and aggregating the minimum of report_date will be as follows,

Report.objects.values('owner').annotate(min_report_date=Min('report_date')).values('owner', 'min_report_date')

and the result will be as follows,

<QuerySet [{'owner': 1, 'min_report_date': datetime.date(2023, 3, 4)}, {'owner': 2, 'min_report_date': datetime.date(2023, 2, 2)}]>

Now I want to return the balance corresponding to the minimum of report_date field in addition to the owner and min_report_date fields like bellow,

<QuerySet [{'owner': 1, 'min_report_date': datetime.date(2023, 3, 4), 'balance': 100}, {'owner': 2, 'min_report_date': datetime.date(2023, 2, 2), 'balance': 1000}]>

My attempt was the bellow Django query,

Report.objects.values('owner').annotate(min_report_date=Min('report_date')).values('owner', 'min_report_date', 'balance')

but the result lost the effect of aggregation (i.e., all rows were returned) and was like bellow,

 <QuerySet [{'owner': 1, 'balance': 50, 'min_report_date': datetime.date(2023, 3, 9)}, {'owner': 1, 'balance': 100, 'min_report_date': datetime.date(2023, 3, 4)}, {'owner': 2, 'balance': 2000, 'min_report_date': datetime.date(2023, 2, 22)}, {'owner': 2, 'balance': 1000, 'min_report_date': datetime.date(2023, 2, 2)}]>

Solution

  • You can work with a Subquery expression [Django-doc]:

    from django.db.models import OuterRef, Subquery
    
    Owner.objects.annotate(
        min_report_balance=Subquery(
            Report.objects.filter(owner_id=OuterRef('pk'))
            .order_by('report_date')
            .values('balance')[:1]
        )
    )

    The Owner objects that arise from this queryset will have an extra attribute min_report_balance that is the balance of the "oldest" Report for that owner, or None (NULL) in case no such report exists.