I have a single table of data with an ID and a EventDate column The date records are from 01/01/1960 to date. I need to know how many IDs do not have a date record/recorded after 01/01/2022
My theory on this is to find any ID with a date after 01/01/2022, and then exclude these IDs, to leave the IDs with NO date
However, I'm not sure how to exclude within the same table using the same field of data
This is my query to find all the IDs that have a date after 01/01/2022
SELECT id, eventdate FROM table WHERE eventdate >= '01/01/2022'
I can't search using NOT as this just gives me the IDs before the date, which may also have records after the date.
SELECT id, eventdate FROM table WHERE ( not ( eventdate >= '01/01/2022' ) or eventdate is null)
How should I construct this query to exclude the IDs within the date period, to leave the IDs with no date recorded?
My SQL is written in Excel VBA and connected to PostGreSQl.
Table
ID | eventdate
1 | 28/12/2023
2 | 27/08/2024
3 | 12/05/2022
What you said translates directly from human to SQL not in
: demo
select count(distinct id) --"I need to know how many IDs"
from your_table
where id not in ( --"do not have a date record"
select id
from your_table
where eventdate>'01/01/2022');--"recorded after 01/01/2022"
Be sure to add an index to speed things up:
create index on your_table(eventdate)include(id);
This will get you an index-only scan plan for the subquery because the index contains everything the query needs to check against, so it won't have to pay heap any expensive visits at all.
The demo shows more ways to achieve the same thing, including a full index-only setup for the above (two symmetric indexes scanned against each other, zero heap fetches), except
, not exists
, an anti-join, and a simple subtraction:
select (select count(distinct id) from your_table)
-(select count(distinct id) from your_table
where eventdate>'01/01/2022');