I'm refactoring some SQL to remove hardcoded dates that exist within several OR conditions, as part of an INSERT INTO statement.
For example:
OR ( site = 'site.com' AND url = 'www.someurl.com' AND date < '2021-01-01')
OR ( site = 'site.com' AND url = 'www.otherurl.com' AND date < '2019-09-09')
site = 'site.com' always stays the same.
url = 'www.someurl.com' will have different URLs
The dates will also change.
In order to remove several lines that exist like the one above, I've created a reference table that contains a list of the site names, a boolean column to mark the interesting urls, and the dates (same as the hardcoded ones)
I've sorted out everything except AND date < '2021-01-01'. How do I select rows from Table A where the url matches the url from my reference table (Table B), where the date is less than the date in the reference table?
I'm down a subquery rabbit hole with the following:
OR (site = 'site.com' AND url IN ( SELECT url_ref
FROM REF_TABLE
LEFT JOIN TABLE A
ON url = url_ref) AND date < ( SELECT start_date
FROM REF_TABLE
LEFT JOIN TABLE A
ON url = url_ref ) )
I was able to eliminate another large OR statement that contained a list of urls, but without dates. I'm stuck on these few cases where I need to only take rows where the date is less than a certain date, as defined in the reference table.
That works like this:
OR (ref.intersting = 't' AND url NOT IN ( SELECT url_ref
FROM REF_TABLE
WHERE url_ref IS NOT NULL ) )
Thank you!
p.s. The purpose f this is so that I can stop editing a python script every time a new exception is created. Instead, I will just add a row to the reference table and the script won't need to change.
You can use EXISTS
:
OR
(outer.site = 'site.com' AND
EXISTS (SELECT 1
FROM REF_TABLE r
WHERE outer.url = r.url AND
outer.date < r.date
)
)
outer
is the table alias for the columns in the outer query.