Tags: python, django, generator, lazy-evaluation

Lazy evaluation in generator with Django model objects


I'm trying to understand how to use a Django QuerySet in a Python generator so that it evaluates lazily.

The documentation does not mention generators explicitly. The following seems to be the only (more or less) related remark, but it does not answer my question:

You can evaluate a QuerySet in the following ways:

  • Iteration. A QuerySet is iterable, and it executes its database query the first time you iterate over it.

  • [...]

I have a Django model like this:

class Document(models.Model):
    text = [...]

    @cached_property
    def process(self):
        [...]

Now I try this:

processed = (doc.process for doc in Document.objects.all())

I noticed, however, that this evaluates the process property immediately for all of the objects, which makes memory consumption explode.

Investigating step by step:

docs = Document.objects.all()
test = (doc for doc in docs)

Document.objects.all() does not trigger any evaluation; it only creates the QuerySet, as expected. However, the second line (test) already loads the whole document set into memory, so the process access shown above is apparently not the cause.

It looks to me like creating a generator expression from a QuerySet already triggers the Django database call. If that is the case, how can I properly achieve what I wanted initially, namely a generator that is evaluated lazily like this:

(doc.process for doc in Document.objects.all())

Solution

  • It looks like the generator expression does actually count as an 'iteration', so Django retrieves every document in the QuerySet from the database and caches the results on the QuerySet. To avoid this, use the iterator() method, which reads results directly without caching them at the QuerySet level:

    (doc.process for doc in Document.objects.iterator())
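
    The difference can be reproduced without Django. The sketch below is only an illustration: FakeQuerySet and its fetched counter are hypothetical stand-ins, not Django code. It assumes the two behaviours described above: iterating the object directly loads and caches every row, while iterator() hands rows out one at a time (the real iterator() reads from the database in chunks rather than row by row).

```python
class FakeQuerySet:
    """Hypothetical stand-in for a Django QuerySet (not Django code)."""

    def __init__(self, rows):
        self._rows = rows
        self._result_cache = None
        self.fetched = 0  # counts rows pulled from the pretend database

    def _fetch_row(self, row):
        self.fetched += 1
        return row

    def __iter__(self):
        # Plain iteration: load every row once and keep it in the cache.
        if self._result_cache is None:
            self._result_cache = [self._fetch_row(r) for r in self._rows]
        return iter(self._result_cache)

    def iterator(self):
        # Lazy iteration: yield rows one at a time and skip the cache.
        for r in self._rows:
            yield self._fetch_row(r)


# Generator expression over the object itself.
eager = FakeQuerySet(["a", "b", "c"])
gen_eager = (row.upper() for row in eager)
first_eager = next(gen_eager)
fetched_eager = eager.fetched  # every row was loaded to produce one item

# Generator expression over iterator().
lazy = FakeQuerySet(["a", "b", "c"])
gen_lazy = (row.upper() for row in lazy.iterator())
first_lazy = next(gen_lazy)
fetched_lazy = lazy.fetched  # only the first row was loaded
```

    After pulling a single item, fetched_eager is 3 and fetched_lazy is 1. In real Django, iterator() also accepts a chunk_size argument that controls how many rows are read from the database at a time.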