[EDIT]
I have created an example Django Repl.it playground preloaded with this exact case:
https://repl.it/@mormoran/Django-Building-dynamic-Q-queries-for-related-tables
[/EDIT]
I'm trying to filter objects on a table, based on related objects, but having trouble doing so.
I have a table Run
:
class Run(models.Model):
start_time = models.DateTimeField(db_index=True)
end_time = models.DateTimeField()
Each Run
object has related table RunValue
:
class RunValue(models.Model):
run = models.ForeignKey(Run, on_delete=models.CASCADE)
run_parameter = models.CharField(max_length=50)
value = models.FloatField(default=0)
In RunValue
we store detailed characteristics of a run, called a run_parameter
. Things such as, voltage, temperature, pressure, etc.
For simplicity's sake, let's assume the fields I want to filter on are "Min. Temperature" and "Max. Temperature".
So for example:
Run 1:
Run Values:
run_parameter: "Min. Temperature", value: 430
run_parameter: "Max. Temperature", value: 436
Run 2:
Run Values:
run_parameter: "Min. Temperature", value: 627
run_parameter: "Max. Temperature", value: 671
Run 3:
Run Values:
run_parameter: "Min. Temperature", value: 642
run_parameter: "Max. Temperature", value: 694
Run 4:
Run Values:
run_parameter: "Min. Temperature", value: 412
run_parameter: "Max. Temperature", value: 534
(RunValue.value are floats, but let's keep it to ints for simplicity).
I have two HTML inputs in my page where the user enters min and max (for temperatures). They can both be left blank, or just one, or both, so it's an open ended filter, it can define a range to filter on, or not. For example, if the user were to input:
Min. temperature = 400
Max. temperature = 500
That set of filters should only return Run 1 from the above Run
instance examples, where the lower threshold is above 400 and the upper threshold is below 500. All other Run
do not qualify.
So then I need to return all Run
object instances where RunValue
matches the filters the user has input.
This is what I have tried:
# Grabbing temp ranges from request and setting default filter mins and maxs:
temp_ranges = [0, 999999] # Defaults in case the user does not set anything
if min_temp_filter:
temp_ranges = [min_temp_filter, 999999]
if max_temp_filter:
temp_ranges = [0, max_temp_filter]
if min_temp_filter and max_temp_filter:
temp_ranges = [min_temp_filter, max_temp_filter]
# Starting Q queries
temp_q_queries = [
Q(runvalue__run_parameter__icontains='Min. Temperature'),
Q(runvalue__run_parameter__icontains='Max. Temperature')
]
queryset = models.Q(reduce(operator.or_, temp_q_queries), runvalue__value__range=temp_ranges)
filtered_run_instances = Run.objects.filter(queryset)
Running that yields some results, but not the desired results. It returns Run 1 and Run 4, when it should only return Run 1.
The temp_ranges
being from 400 to 500, Run 1 qualifies, but the max temperature of Run 4 is over 500, it should NOT qualify. The filters need to exclude object instances by looking at both ranges at the same time, the minimum AND the maximum.
The printed query is as follows:
(AND: (OR: ('runvalue__run_parameter__icontains', 'Min. Temperaure'), ('runvalue__run_parameter__icontains', 'Max. Temperature')), ('runvalue__value__range', ['400', '500']))
What I think I need to be filtering in pseudo code:
All Runs that have RunValue instances where the RunValue.run_parameter is either "Min. Temperature" OR "Max. Temperature" AND the RunValue.value are between 400 and 500
.
I then thought I should be including the value ranges in the Q queries as a regular Django filter, separated by commas:
temp_q_queries = [
Q(runvalue__run_parameter__icontains='Min. Temperature', runvalue__value__range=temp_ranges),
Q(runvalue__run_parameter__icontains='Max. Temperature', runvalue__value__range=temp_ranges)
]
queryset = models.Q(reduce(operator.or_, temp_q_queries))
filtered_run_instances = Run.objects.filter(queryset)
Same result, so the value range is not the issue there, it's the logic grouping (I think?).
So I tried to do two reduce Q queries (a bit gnarly looking), so as to say:
All Runs that have RunValue instances where the name is "Min. Temperature" AND the values are higher than 400, AND all Runs that have RunValue instances where the name is "Max. Temperature" AND the values are lower than 500
temp_q_queries = [
models.Q(reduce(operator.and_, [Q(runvalue__run_parameter__icontains='Min. Temperature'), Q(runvalue__value__gte=temp_ranges[0])]),
models.Q(reduce(operator.and_, [Q(runvalue__run_parameter__icontains='Max. Temperature'), Q(runvalue__value__lte=temp_ranges[1])]))
]
queryset = models.Q(reduce(operator.and_, temp_q_queries))
filtered_run_instances = Run.objects.filter(queryset)
(Note all 3 reduce
where changed to AND gates)
This yielded 0 hits.
Using the same compound reduce method for temp_q_queries
but changing the outter logic gate for queryset
to OR yields the same wrong results, Run 1 and Run 4:
queryset = models.Q(reduce(operator.or_, temp_q_queries))
filtered_run_instances = Run.objects.filter(queryset)
Perhaps I am over complicating myself here and there's something very simple I'm not seeing (I've been trying to solve this logic puzzle for 2 days now, getting a bit of tunnel vision. But I'm rather hoping it's solvable, and simple.
Any help or questions would be greatly appreciated.
Your problem was that you need to meet both conditions and they are never valid both on the same row of RunValue related table. You want to select root objects that have a row "Min. Temperature" in that range and similarly also a "Max. Temperature" valid row. You must use subqueries.
The best is to use Django 3.0 Exists() subquery condition. It can be easily customized for an old Django.
A concrete example:
from django.db.models import Exists, OuterRef
queryset = Run.objects.filter(
Exists(RunValue.objects.filter(
run=OuterRef('pk'),
run_parameter='Min. temperature',
value__gte=400)),
Exists(RunValue.objects.filter(
run=OuterRef('pk'),
run_parameter='Max. temperature',
value__lte=500)),
)
The same by a general solution, because you want a dynamic filter:
filter_data = {
'Min. temperature': 400,
'Max. temperature': 500,
}
param_operators = {
'Min. Temperature': 'gte',
'Max. Temperature': 'lte',
# much more supported parameters... e.g. 'some boolean as 0 or 1': 'eq'.
}
conditions = []
for key, value in filter_data.items():
if value is not None:
conditions.append(Exists(RunValue.objects.filter(
run=OuterRef('pk'),
run_parameter=key,
**{'value__{}'.format(param_operators[key]): value}
)))
queryset = Run.objects.filter(*conditions)
You know that "Min. Temperature" <= "Max. Temperature", but the database optimizer doesn't know it. I optimized it by removing a superfluous condition of the range. It is also better to completely remove a useless condition "Max. Temperature" <= 999999.
This answer can be easily customized for Django >=1.11 <= 2.2 Exists() condition after you read approximately tens lines of that documentation.
You don't need a Q()
object in this simple case, even if you want to rewrite it by short one-line expressions and add mnemonic temporary variables.
EDIT The concrete example can be rewritten for Django < 3.0 this way
queryset = Run.objects.annotate(
min_temperature_filter=Exists(RunValue.objects.filter(
run=OuterRef('pk'),
run_parameter='Min. temperature',
value__gte=400)),
max_temperature_filter=Exists(RunValue.objects.filter(
run=OuterRef('pk'),
run_parameter='Max. temperature',
value__lte=500)),
).filter(
min_temperature_filter=True,
max_temperature_filter=True,
)