Search code examples
pythondjangoannotations

Django: Annotate the count of the number of duplicate values for each object


I have a model called Product seen below. Products can have the same field 'bc_sku'.

class Product(models.Model)

    bc_sku                          = models.IntegerField(null=True, blank=True)
    product_type                    = models.CharField(null=True, blank=True, max_length=50)
    merchant                        = models.CharField(null=True, blank=True, max_length=50)
    product_price                   = models.DecimalField(null=True, blank=True, max_digits=10, decimal_places=2)

For example, imagine I had this list of objects

    bc_sku | product_type | merchant | product_price
    100    | good         | A        | 1.00
    100    | bad          | B        | 2.00
    100    | bad          | C        | 3.00
    101    | good         | A        | 7.00
    101    | bad          | B        | 5.00

What I'd like to do is create a query which annotates each "good" product with the count of the number of duplicates along with the minimum price for each 'bc_sku'. I would then like to be able to use these objects and values in a template.

    bc_sku | product_type | merchant | dup_count | min_price
    100    | good         | A        | 3         | 1.00
    101    | good         | A        | 2         | 5.00

Any help would be greatly appreciated as I'm struggling to get annotations and filters to make it work currently.


Solution

  • The first wish was to use window function, but unfortunately it is not allowed to combine annotate(Window(...)) and filter(...)

    The answer is:

    from django.db.models import OuterRef, Subquery, Count, Min
    
    subquery = Product.objects.filter(bc_sku=OuterRef('bc_sku')).values('bc_sku')
                              .annotate(dup_count=Count('*'), min_price=Min('product_price'))
    Product.objects.filter(product_type='good')
                   .annotate(dup_count=Subquery(subquery.values('dup_count')), 
                             min_price=Subquery(subquery.values('min_price')))
    

    You can find details on how this query works in docs