I'm performing a raw query in Django in order to make use of PostgreSQL's generate_series
function and get rows back for each date in an interval, comparing the generated dates with datetime ranges (order_dates
) in the BaseEntry model (something not really possible directly in the ORM).
I'm using an unmanaged model:
class OrderBinnedStat(models.Model):
bin = models.DateTimeField()
id_count = models.IntegerField()
avg_flow = models.IntegerField()
sum_flow = models.IntegerField()
class Meta:
managed = False
And running the following code to query and view the result:
from django.utils import timezone
timezone.activate("UTC")
end_dt = timezone.datetime(year=2021, month=12, day=26, hour=0, minute=0, second=0)
end_dt = timezone.make_aware(end_dt)
start_dt = end_dt - timezone.timedelta(days=1)
minutes = 60
report = OrderBinnedStat.objects.raw("""
SELECT row_number() OVER () AS id,
dt_range.bin,
count(DISTINCT t.id) AS id_count,
avg(DISTINCT t.flow) AS avg_flow,
sum(DISTINCT t.flow) AS sum_flow
FROM (
SELECT generate_series(%s, %s - interval '1 milliseconds', interval '%s min')
FROM public.orders_baseentry t -- is this line needed?
) dt_range(bin)
LEFT JOIN public.orders_baseentry t ON t.order_dates @> dt_range.bin
-- WHERE t.status = ANY(ARRAY['new', 'cancelled'])
GROUP BY dt_range.bin
ORDER BY dt_range.bin;
""", [start_dt, end_dt, minutes])
for item in report:
print(item.id, item.bin, item.id_count, item.avg_flow, item.sum_flow)
It works great when I'm not filtering the baseentry
table. I get 24 rows, as expected (one for each hour of the requested interval). There are no order_dates
that overlap with 2021-12-25 21:00:00+00:00 and later, so the values are all 0
or None
for those rows. Perfect!
1 2021-12-25 00:00:00+00:00 1 0.65128747160000000000 0.6512874716
2 2021-12-25 01:00:00+00:00 1 0.65128747160000000000 0.6512874716
3 2021-12-25 02:00:00+00:00 1 0.65128747160000000000 0.6512874716
4 2021-12-25 03:00:00+00:00 1 0.65128747160000000000 0.6512874716
5 2021-12-25 04:00:00+00:00 1 0.65128747160000000000 0.6512874716
6 2021-12-25 05:00:00+00:00 1 0.65128747160000000000 0.6512874716
7 2021-12-25 06:00:00+00:00 1 0.65128747160000000000 0.6512874716
8 2021-12-25 07:00:00+00:00 1 0.65128747160000000000 0.6512874716
9 2021-12-25 08:00:00+00:00 1 0.65128747160000000000 0.6512874716
10 2021-12-25 09:00:00+00:00 1 0.65128747160000000000 0.6512874716
11 2021-12-25 10:00:00+00:00 1 0.65128747160000000000 0.6512874716
12 2021-12-25 11:00:00+00:00 1 0.65128747160000000000 0.6512874716
13 2021-12-25 12:00:00+00:00 1 0.65128747160000000000 0.6512874716
14 2021-12-25 13:00:00+00:00 1 0.65128747160000000000 0.6512874716
15 2021-12-25 14:00:00+00:00 1 0.65128747160000000000 0.6512874716
16 2021-12-25 15:00:00+00:00 1 0.65128747160000000000 0.6512874716
17 2021-12-25 16:00:00+00:00 1 0.65128747160000000000 0.6512874716
18 2021-12-25 17:00:00+00:00 1 0.65128747160000000000 0.6512874716
19 2021-12-25 18:00:00+00:00 1 0.65128747160000000000 0.6512874716
20 2021-12-25 19:00:00+00:00 1 0.65128747160000000000 0.6512874716
21 2021-12-25 20:00:00+00:00 1 0.65128747160000000000 0.6512874716
22 2021-12-25 21:00:00+00:00 0 None None
23 2021-12-25 22:00:00+00:00 0 None None
24 2021-12-25 23:00:00+00:00 0 None None
But if I uncomment the WHERE
clause to start filtering (or add any other filtering), I get the following result:
1 2021-12-25 00:00:00+00:00 1 0.65128747160000000000 0.6512874716
2 2021-12-25 01:00:00+00:00 1 0.65128747160000000000 0.6512874716
3 2021-12-25 02:00:00+00:00 1 0.65128747160000000000 0.6512874716
4 2021-12-25 03:00:00+00:00 1 0.65128747160000000000 0.6512874716
5 2021-12-25 04:00:00+00:00 1 0.65128747160000000000 0.6512874716
6 2021-12-25 05:00:00+00:00 1 0.65128747160000000000 0.6512874716
7 2021-12-25 06:00:00+00:00 1 0.65128747160000000000 0.6512874716
8 2021-12-25 07:00:00+00:00 1 0.65128747160000000000 0.6512874716
9 2021-12-25 08:00:00+00:00 1 0.65128747160000000000 0.6512874716
10 2021-12-25 09:00:00+00:00 1 0.65128747160000000000 0.6512874716
11 2021-12-25 10:00:00+00:00 1 0.65128747160000000000 0.6512874716
12 2021-12-25 11:00:00+00:00 1 0.65128747160000000000 0.6512874716
13 2021-12-25 12:00:00+00:00 1 0.65128747160000000000 0.6512874716
14 2021-12-25 13:00:00+00:00 1 0.65128747160000000000 0.6512874716
15 2021-12-25 14:00:00+00:00 1 0.65128747160000000000 0.6512874716
16 2021-12-25 15:00:00+00:00 1 0.65128747160000000000 0.6512874716
17 2021-12-25 16:00:00+00:00 1 0.65128747160000000000 0.6512874716
18 2021-12-25 17:00:00+00:00 1 0.65128747160000000000 0.6512874716
19 2021-12-25 18:00:00+00:00 1 0.65128747160000000000 0.6512874716
20 2021-12-25 19:00:00+00:00 1 0.65128747160000000000 0.6512874716
21 2021-12-25 20:00:00+00:00 1 0.65128747160000000000 0.6512874716
WHERE
clause be placed somewhere else?I need to be able to filter the baseentry
table while also keeping the null rows.
Replace WHERE t.status = ANY(ARRAY['new', 'cancelled'])
with AND t.status = ANY(ARRAY['new', 'cancelled'])
, and it works correctly!