Search code examples
sqldjangodjango-querysetdjango-q

Django Optimisation of a queryset with Q objects


I'm using Django 1.8.

I have a queryset that requires a logical "or" and an "and". It gives :

MyModel.objects.filter(
    Q(start__gt=today) | Q(end__lte=today),
    active=True).update(active=False)

As you may understand it, it should take every active MyModel instances which shouldn't have already started, and the one that have finished, and deactivate them.

"start" and "end" are DateFields, and "active" is a boolean.

It works, but it generates a query that is far from being optimised. I'd like to be able to start the query by filtering on "active" state, then check the other two fields, because in my database, I have thousands of entries, but only few of them have active=True. I would say that this boolean test is faster than the comparisons.

I can't reorder the arguments, because the former with the two Q() is a positioned argument while the latter is a name argument, and I can't chain multiple filter() because it generates an "or", and not an "and".

Is there a way to do this?


Solution

  • First of all, the SQL command generated by the Django ORM probably won't have the condition clauses in the same order as your .filter methods. So don't worry about the "optimal" order.

    Second, no matter in which order the clauses appear in the SQL command, the DB engine will optimize the request and generate an execution plan tailored to your data distribution. Any DB engine worth considering keeps some data distribution statistics. If the proportion of active records is actually the best discriminant in this query, then it will be filtered first.