I have a table of companies and a table of transcripts that are related to those companies — many-to-one.
I want to be able to list companies that have not been checked recently AND do not have transcripts published recently.
The problem I'm running into is when I join the tables, I can't figure out how to only limit the second condition to the most recent transcript. My query is unintentionally returning companies that have recent transcripts if it detects a transcript that is older.
Company Table
id | name | last_checked_at |
---|---|---|
1 | ACME | 2022-10-11 02:50:52.184975+00 |
2 | MeepMeep | 2022-05-12 02:50:52.184975+00 |
3 | TNT | 2022-05-12 02:50:52.184975+00 |
Transcripts Table
id | company | published_at |
---|---|---|
5 | 1 | 2022-10-11 02:50:52.184975+00 |
6 | 2 | 2022-10-11 02:50:52.184975+00 |
7 | 2 | 2022-05-12 02:50:52.184975+00 |
8 | 3 | 2022-06-11 02:50:52.184975+00 |
9 | 3 | 2022-03-12 02:50:52.184975+00 |
Desired Logic
company.id
scompany.last_checked_at
is older than 1 week from todaytranscript.published_at
is older than 3 months from todayExpected Behavior
Using the data in the tables above:
last_checked_at
within 7 days of todaylast_checked_at
is greater than 7 days, since the most recent transcript was published_at
within 3 months of todaylast_checked_at
is greater than 7 days and the most recent transcript was published_at
greater than 3 months of todayAttempts
SELECT * FROM summaries s LEFT OUTER JOIN companies c ON s.company = c.id WHERE s.published_at < now() - INTERVAL '3 months' ORDER BY s.published_at ASC limit 1
We can try using DISTINCT ON
on the transcripts table:
WITH cte AS (
SELECT DISTINCT ON (company) *
FROM summaries
WHERE published_at < NOW() - INTERVAL '3 months'
ORDER BY company, published_at DESC
)
SELECT c.name, c.last_checked_at, s.published_at
FROM companies c
INNER JOIN cte s
ON s.company = c.id
WHERE c.last_checked_at < NOW() - INTERVAL '1 week';