Search code examples
pythondjangodjango-querysetdjango-aggregationdjango-annotate

Django query to Sum the lengths of ArrayFields


I have this model:

class Interaction(models.Model):
    user = models.ForeignKey(User)
    codes = ArrayField(models.CharField(choices=CODE_CHOICES))

and I'm trying to figure out how to do the equivalent of this SQL query in Django:

select user_id, sum(cardinality(codes)) from interaction group by user_id;
  • I tried extra(select={codes_len':'cardinality(codes)'}), but you cannot annotate or aggregate over an extra field.
  • I tried annotate(Sum("cardinality('codes')")), but cardinality('codes') isn't a field on the model.
  • I investigated writing a custom aggregate field that combined Sum and cardinality, but that looked... brittle.
  • I discovered in the docs that __len does work properly on an ArrayField, but not in the context of annotate(Sum('codes__len')).
  • I ruled out raw SQL because there are a lot of WHERE statements (omitted here) which make this query difficult to reconstruct by hand.

At this point I think I have no other choice but to add a field to the model that is the length of the codes field and mess with save() to keep it in sync.

Is there really no other way? Am I missing something?


Solution

  • It turns out that the custom aggregate function is the way to go!

    With the following:

    from django.db.models import Aggregate
    
    class SumCardinality(Aggregate):
        template = 'SUM(CARDINALITY(%(expressions)s))'
    

    The query is as simple as:

    Interaction.objects().filter(xxx).\
        values('user_id').annotate(codes_len=SumCardinality('codes'))