Search code examples
djangodjango-querysetdjango-aggregation

Django aggregation: Group and sum over a date range


I am fetching some user stats daily and recording them in a model as follows (irrelevant parts are stripped off for simplicity):

class User(models.Model):
    group = models.CharField(
        choices=(('A', 'Group A'), ('B', 'Group B'),
    ))

class Stats(models.Model):
    day = models.DateField()
    user = models.ForeignKey(User)
    follower_count = models.PositiveIntegerField()

As seen above, each user belongs to a group.

How can I get the sum each user's follower_counts for each group over a date range?

In other words, what is the best way to build a data structure as follows using these models? Is it possible to do it with a single aggregate query?

[{
    'date': '2015-07-15',
    'Group A': 26,  # sum of followers of users in Group A on 2015-07-15
    'Group B': 15,
 }, {
    'date': '2015-07-16',
    'Group A': 30, 
    'Group B': 18,
 }, {
    'date': '2015-07-17',
    'Group A': 32, 
    'Group B': 25,
 }]

Thank you.


Solution

  • You should be able to get the desired aggregate query by using this block of code.

    Stats.objects.values('day').annotate(
        group_a=Sum(Case(When(user__group='A', then='follower_count'))),
        group_b=Sum(Case(When(user__group='B', then='follower_count')))
    )
    

    Basically it tells the Django ORM to get the sums of the follower_count's of the two groups A and B, and the column aliases will be "group_a" and "group_b" respectively. The aggregation will be performed with a GROUP BY using the 'day' field.

    The resulting queryset will give you the details you want. The rest will be just formatting. You may use the basic JSON serializer Django provides to get the format you want, but if it is for a Web API, you might want to take a look at Django REST Framework, particularly the serializers.