Search code examples
pythondjangodjango-orm

How to count a single field of a django queryset with multiple group by?


Let's say I have a queryset qs. I'm grouping by the queryset as follows:

(
    qs.annotate(
        catering_price_enabled=F("outlet__library__settings__sligro_catering_price_enabled"),
    )
    .values("assortment_sync_id", "catering_price_enabled")
    .order_by("assortment_sync_id", "catering_price_enabled")
    .distinct("assortment_sync_id", "catering_price_enabled")
)

And I'm getting something like:

<QuerySet [
   {'assortment_sync_id': '01234', 'catering_price_enabled': False}, 
   {'assortment_sync_id': '01234', 'catering_price_enabled': None}, 
   {'assortment_sync_id': '56789', 'catering_price_enabled': None},
]>

What I'm trying to do is to annotate this queryset so I can eventually filter for values > 1. In other words, each assortment_sync_id can have only value of catering_price_enabled.

If I add .annotate(count=Count("assortment_sync_id")) django raises NotImplementedError: annotate() + distinct(fields) is not implemented. I tried this approach because it obviously works with just one field.

How can I get the expected output below?

<QuerySet [
   {'assortment_sync_id': '01234', 'catering_price_enabled': False, 'count': 2}, 
   {'assortment_sync_id': '01234', 'catering_price_enabled': None, 'count': 2}, 
   {'assortment_sync_id': '56789', 'catering_price_enabled': None, 'count': 1},
]>

Solution

    1. Use .distinct() to find the distinct rows with assortment_sync_id and catering_price_enabled
    2. Create a subquery that filters pk in distinct_queryset from 1., group by assortment_sync_id and counts assortment_sync_id
    3. Create a result queryset by filtering pk in distinct queryset from 1. then use the count queryset from 2. to annotate in the final result.
    qs = qs.annotate(catering_price_enabled=F("outlet__library__settings__sligro_catering_price_enabled"))
    
    distinct_qs = qs.distinct("assortment_sync_id", "catering_price_enabled")
    
    count_qs = (
        qs.filter(
            pk__in=distinct_qs.values("pk"),
            assortment_sync_id=models.OuterRef("assortment_sync_id")
        )
        .values("assortment_sync_id")
        .order_by("assortment_sync_id")
        .annotate(count=models.Count("*"))
        .values("count")
    )
    
    result = (
        qs.filter(pk__in=distinct_qs.values("pk"))
        .annotate(
            count=models.Subquery(count_qs)
        ).values("assortment_sync_id", "catering_price_enabled", "count")
    )