Search code examples
pythondjangodjango-querysetaggregation

Using prefetch_related and aggregations to avoid n+1 issue with Django database queries for model with time series data


I am trying to avoid an obscene amount of database queries in a Django app. In the app I am monotoring a number of suggestions (model: Suggestion) that can be voted for (model: Vote).

The Vote model does not store each individual vote. Instead the total number of votes for a suggestion are stored at regular intervals. A suggestion of "Better ice cream" could have "10 votes at 8:10", "12 votes at 8:20", "25 votes at 8:30", etc.

I have created a very ineffecient loop with some major n+1 issues to calculate the number of new votes per day per suggestion.

I am looking for a more efficient (probably single) queryset than the current ones for the same functionality. I know that I should probably create some kind of annotation by dates of votes on "suggestions" in views.py and then annotate that by my aggregate function that calculates the number of votes on each day, but I cannot figure out how to actually chain this together.

Here's my current working but very inefficient code:

models.py:

class Suggestion(models.Model):
    unique_id = models.CharField(max_length=10, unique=True)
    title = models.CharField(max_length=500)
    suggested_date = models.DateField()
​
class Vote(models.Model):
    suggestion = models.ForeignKey('Suggestion', on_delete=models.CASCADE)
    timestamp = models.DateTimeField()
    votes = models.IntegerField()

views.py:

def index(request):
    # Proces votes per day per suggestion
    suggestions = Suggestion.objects.prefetch_related('vote_set')
    votes_per_day_per_suggestion = {}
    for suggestion in suggestions:
        votes_per_day_per_suggestion[suggestion.title] = {}
        votes = suggestion.vote_set
        suggestion_dates = votes.dates('timestamp', 'day') # n+1 issue
        for date in suggestion_dates:
            date_min_max = votes.filter(timestamp__date=date).aggregate(votes_on_date=(Max('votes') - Min('votes'))) # n+1 issue
            votes_per_day_per_suggestion[suggestion.title][date] = date_min_max['votes_on_date']
    context['votes_per_day_per_suggestion'] = votes_per_day_per_suggestion
    return render(request, 'borgerforslag/index.html', context)

Template output:

Better toilet paper (number of votes per day):
19. october 2021: 23
20. october 2021: 19
21. october 2021: 18
22. october 2021: 9
23. october 2021: 25
24. october 2021: 34
25. october 2021: 216

Solution

  • The following should get you all suggestions, dates and sum of votes in a values queryset

    from django.db.models import Max, Min
    from django.db.models.functions import TruncDate
    
    
    def index(request):
        suggestions = Suggestion.objects.annotate(
            date=TruncDate('vote__timestamp')
        ).order_by(
            'id', 'date'
        ).annotate(
            sum=Max('vote__votes') - Min('vote__votes')
        )
        return render(request, 'borgerforslag/index.html', {'suggestions': suggestions})
    

    Then in template use regroup to group all those results by the suggestion

    {% regroup suggestions by title as suggestions_grouped %}
    
    <ul>
    {% for suggestion in suggestions_grouped %}
        <li>{{ suggestion.grouper }}
        <ul>
            {% for date in suggestion.list %}
              <li>{{ date.date }}: {{ date.sum }}</li>
            {% endfor %}
        </ul>
        </li>
    {% endfor %}
    </ul>