Search code examples
sqldjangomodel

Group by or distinct for django to remove the row which has the duplicate column


I have table as follows:

Table A

    ID name score
     1 aaa   100
     2 bbb   200
     3 ccc   300
     4 bbb   100
     5 kkk   600
     6 ccc   300

Now name bbb and ccc is duplicated
So, I want to remove then and depict like this,

    ID name score
     1 aaa   100
     2 bbb   200
     3 ccc   300
     5 kkk   600

I found out that django model doesn't have groupby so, I guess need to use annotate or distinct

So my idea is like this below,

    MyModel.objects.annotate(Count('name'))

However it doesn't work as I hope.

Any help or hint?


Solution

  • Distinct results based on a field are not trivial for a database other than in Django at the moment of writing.

    But we can get this done with an Exists subquery:

    from django.db.models import Exists, OuterRef
    
    MyModel.objects.filter(
        ~Exists(
            MyModel.objects.filter(pk__lt=OuterRef('pk'), name=OuterRef('name'))
        )
    )

    We thus only generate a MyModel object in the result, of no other MyModel exists with the same name and a smaller primary key.