Tags: python, django, duplicates, django-annotate

Django: get duplicates based on annotation


I want to get all duplicates based on a case-insensitive field value.

Basically, I want to rewrite this SQL query

SELECT count(*), lower(name)
FROM manufacturer
GROUP BY lower(name)
HAVING count(*) > 1;

with the Django ORM. I was hoping something like this would do the trick:

from django.db.models import Count
from django.db.models.functions import Lower

from myapp.models import Manufacturer


qs = Manufacturer.objects.annotate(
    name_lower=Lower('name'),
    cnt=Count('name_lower')
).filter(cnt__gt=1)

but of course it didn't work.

Any idea how to do this?


Solution

  • You can try this:

    qs = Manufacturer.objects.annotate(lname=Lower('name')
         ).values('lname').annotate(cnt=Count(Lower('name'))
         ).values('lname', 'cnt').filter(cnt__gt=1).order_by('lname', 'cnt')
    

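    Why add the order_by()? See the section on interaction with default ordering or order_by() in the Django aggregation docs: any field named in order_by() (and, before Django 3.1, in the model's default Meta.ordering) is added to the GROUP BY clause, which can split the groups and hide the duplicates. Ordering explicitly by the grouped fields keeps the grouping on lname only.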

    The resulting SQL query looks like:

    SELECT 
        LOWER("products_manufacturer"."name") AS "lname",
        COUNT(LOWER("products_manufacturer"."name")) AS "cnt"
    FROM "products_manufacturer"
    GROUP BY LOWER("products_manufacturer"."name")
    HAVING COUNT(LOWER("products_manufacturer"."name")) > 1
    ORDER BY "lname" ASC, "cnt" ASC
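
As a quick sanity check outside Django, the generated SQL can be run against a throwaway SQLite database. This is only a sketch: the sample rows are made up for illustration, the table name is simply copied from the query above, and column quoting is dropped for brevity.

```python
import sqlite3

# Hypothetical in-memory table and sample rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products_manufacturer (id INTEGER PRIMARY KEY, name TEXT)"
)
conn.executemany(
    "INSERT INTO products_manufacturer (name) VALUES (?)",
    [("Acme",), ("ACME",), ("acme",), ("Bosch",), ("Sony",), ("sony",)],
)

# Same shape as the SQL produced by the queryset above:
# group on the lowercased name and keep groups with more than one row.
duplicates = conn.execute(
    """
    SELECT LOWER(name) AS lname, COUNT(LOWER(name)) AS cnt
    FROM products_manufacturer
    GROUP BY LOWER(name)
    HAVING COUNT(LOWER(name)) > 1
    ORDER BY lname ASC, cnt ASC
    """
).fetchall()

print(duplicates)  # [('acme', 3), ('sony', 2)]
```

"Bosch" appears once, so it is filtered out by the HAVING clause; the other two names collapse into one group each once lowercased.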