Search code examples
pythonsqldjangoquery-optimization

How do I optimize the SQL for this Django view?


I'm writing a Django view that queries a list of objects from a database along with some info from a related field. The issue is that the database is doing 10 queries on 7 objects. Here's the code:

Template code:

{% for resource in results %}
    <li>
        <a href="{{resource.url}}"> {{resource.url}}
                 {{resource.stats_set.values.0.avg}}</a>
<!-- truncated, the rest of the template just generates a form and 
 isn't causing any unnecessary sql -->
{% endfor %}

View code:

class ResourceList(generic.ListView):
    model = Submissions
    context_object_name = 'results'
    template_name = 'url_list.html'

And the necessary models:

class Submissions(models.Model):
    LEVEL = (
        ('Introductory', 'Introductory'),
        ('Intermediate', 'Intermediate'),
        ('Academic', 'Academic'),
    )

    MEDIA_TYPE = (
        ('HTML', 'HTML'),
        ('PDF', 'PDF'),
        ('Video', 'Video'),
        ('Other', 'Other'),
    )
    id = models.AutoField(primary_key=True)
    url = models.URLField(unique=True)
    tags = models.ForeignKey(MajorTags, default=0)
    level = models.CharField(choices=LEVEL, max_length=25)
    media_type = models.CharField(choices=MEDIA_TYPE, max_length=25)

    def __unicode__(self):
        return self.url

    def __str__(self):
        return self.url

class Stats(models.Model):
    url = models.ForeignKey(Submissions)
    id = models.AutoField(primary_key=True)
    avg = models.FloatField(default=0)
    std_dev = models.FloatField(default=0)

    def __unicode__(self):
        return self.url.url + " " + str(self.avg)

    def __str__(self):
        return self.url.url + " " + str(self.avg)

The root of the issue is the {{resource.stats_set.values.0.avg}} bit. The problem is that if I try to do queryset = Submissions.objects.all().select_related('avg') I get an error saying the only option is for select_related() is tags which makes sense because that's the only field in the Submissions model that has a foreign key. My database structure is fairly well locked in at this point, so I can't change the database. I also can't query Stats and do a reverse from there because I need to get data that doesn't have stats yet.

How do I reduce the number of SQL queries I'm running?


Solution

  • You should use prefetch_related which is used to make efficient queries concerning reverse foreign keys or many-to-many relationships:

    From the docs:

    prefetch_related()

    Returns a QuerySet that will automatically retrieve, in a single batch, related objects for each of the specified lookups.

    def get_queryset(self,*args,**kwargs):
         return Submissions.objects.all().prefetch_related('stats_set')
    

    It will require an additional query, but the idea is that it will get all of the related information at once, instead of once per Submissions object.

    On a side note it's recommended to use the singular of a noun for the model class name: Submission instead of Submissions.