So, I am trying to create a query that counts the numbers of Redirects per week per Site. I made the below query which gives me almost the correct results, but if there are no Redirects, there are no count results (meaning no 0, no None, just no result). I have tried putting the coalese outside the count, but same result.
How can I make this query give me None or 0 when there are no counts?
date_series = db.func.generate_series(min_date , todays_date, timedelta(weeks=1))
trunc_date = db.func.date_trunc('week', date_series)
subquery = db.session.query(trunc_date.label('week')).subquery()
query = db.session.query(subquery.c.week, Site, db.func.count(db.func.coalesce(Redirect.id, 0)))\
.outerjoin(Site, subquery.c.week == db.func.date_trunc('week', Redirect.timestamp))\
.outerjoin(Site, Redirect.site_id == Site.id)\
.group_by(subquery.c.week, Site.id)
counts = query.all()
Edit:
So I watered down my query even more, removing Site from the grouping, swapped 'week' with 'day', and instead trying to get desired result for only one Site. However, it still doesn't work and it gives me results as if I was doing a join, and not an outerjoin. I really don't understand why it won't give me 0 count results.
See updated query:
query = db.session.query(subquery.c.day, db.func.count(Redirect.id))\
.outerjoin(Redirect, subquery.c.day == db.func.date(Redirect.timestamp))\
.filter(Redirect.site_id == 82)\
.group_by(subquery.c.day)
It still gives me results for only the days where there are values:
0:(datetime.datetime(2022, 11, 23, 0, 0), 1)
1:(datetime.datetime(2023, 12, 7, 0, 0), 1)
And the generated SQL:
SELECT anon_1.day AS anon_1_day, count(redirect.id) AS count_1
FROM (SELECT date_trunc(%(date_trunc_1)s, generate_series(%(generate_series_1)s, %(generate_series_2)s, %(generate_series_3)s)) AS day) AS anon_1 LEFT OUTER JOIN redirect ON anon_1.day = date(redirect.timestamp)
WHERE redirect.site_id = %(site_id_1)s GROUP BY anon_1.day
I believe I figured it out.
The initial subquery is equivalent to the one I showed in my question, except I changed week to day.
Now looks like this:
date_series = db.func.generate_series(min_date , todays_date, '1 day')
trunc_date = db.func.date_trunc('day', date_series)
subquery = db.session.query(trunc_date.label('day')).subquery()
Then I had to do a second subquery to generate the crossproduct between the dates in the first subquery and Site, meaning all possible combinations of the two. I was not able to make the cross product in the initial subquery as it kept complaining that it couldn't make the join.
So here is the cross product query:
from sqlalchemy import true
subquery_cross = db.session.query(subquery.c.day.label('day'), Site.id.label('site_id'))\
.join(Site, true())\
.group_by(subquery.c.day, Site)\
.subquery()
Then I could use this to join in the count of Redirects by outerjoining on both date and site_id simultanously.
The query looks like this:
query = db.session.query(subquery_cross.c.day, subquery_cross.c.site_id, db.func.count(Redirect.id))\
.outerjoin(Redirect, (subquery_cross.c.day == db.func.date(Redirect.timestamp)) & (subquery_cross.c.site_id == Redirect.site_id))\
.group_by(subquery_cross.c.day, subquery_cross.c.site_id)
counts = query.all()
This now gives me counts for each day, and each Site, with 0's where no Redirect records.