sql, django, django-orm

Django select only rows with duplicate field values


Suppose we have a model in Django defined as follows:

from django.db import models

class Literal(models.Model):
    name = models.CharField(...)
    ...

The name field is not unique, so it can contain duplicate values. I need to accomplish the following task: select all rows from the model whose name value also appears in at least one other row.

I know how to do it using plain SQL (maybe not the best solution):

select * from literal where name IN (
    select name from literal group by name having count(name) > 1
);

So, is it possible to select this using the Django ORM? Or is there a better SQL solution?
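
For reference, the behavior of the SQL above can be checked with Python's built-in sqlite3 module. This is only a sketch: the sample rows and the id column are assumptions made for illustration, and COUNT(*) is used in place of count(name), which gives the same result as long as name is not NULL.

```python
import sqlite3

# Hypothetical sample data; the table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE literal (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO literal (name) VALUES (?)",
    [("foo",), ("bar",), ("foo",), ("baz",), ("bar",), ("foo",)],
)

# The subquery finds names that occur more than once; the outer query
# returns every row carrying one of those names.
rows = conn.execute(
    """
    SELECT id, name FROM literal
    WHERE name IN (
        SELECT name FROM literal GROUP BY name HAVING COUNT(*) > 1
    )
    ORDER BY id
    """
).fetchall()

print(rows)
```

With this data, the only row left out is the single "baz" row, since its name occurs once.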


Solution

  • Try:

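    from django.db.models import Count

    # Group by name and keep only the names that occur more than once.
    # The empty order_by() clears any default Meta.ordering, which could
    # otherwise affect the grouping.
    (Literal.objects.values('name')
                    .annotate(Count('id'))
                    .order_by()
                    .filter(id__count__gt=1))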
    

    This is as close as you can get with Django. The problem is that this returns a ValuesQuerySet (in Django 1.9 and later, a QuerySet that yields dictionaries) containing only name and id__count, not full Literal instances. However, you can use it to build a regular QuerySet by feeding it into a second query:

    dupes = (Literal.objects.values('name')
                            .annotate(Count('id'))
                            .order_by()
                            .filter(id__count__gt=1))
    # Evaluates dupes, then fetches the full rows for the duplicated names.
    Literal.objects.filter(name__in=[item['name'] for item in dupes])
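
The two-step approach above amounts to two round trips to the database: one grouped query to collect the duplicated names, then one IN query with those names as literal values. A rough sketch of the same pattern using Python's sqlite3 module follows. The SQL strings only approximate what the ORM would generate, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data standing in for the Literal table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE literal (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO literal (name) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",)],
)

# Step 1: roughly what values('name').annotate(Count('id'))
# .filter(id__count__gt=1) asks for: one row per duplicated name.
dupes = conn.execute(
    "SELECT name, COUNT(id) AS id__count FROM literal "
    "GROUP BY name HAVING COUNT(id) > 1"
).fetchall()
names = [name for name, _count in dupes]

# Step 2: roughly what filter(name__in=[...]) asks for: full rows whose
# name is in the list gathered in step 1, passed as bound parameters.
placeholders = ", ".join("?" for _ in names)
full_rows = conn.execute(
    f"SELECT id, name FROM literal WHERE name IN ({placeholders}) ORDER BY id",
    names,
).fetchall()

print(dupes)
print(full_rows)
```

Because the names travel through Python between the two queries, the IN list grows with the number of duplicated names, which is worth keeping in mind on large tables.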