Search code examples
sql-serverssms

How to filter IDs based on dates?


I have the following table:

ID | DATES
---+-----------
1   02-09-2010
2   03-08-2011
1   08-01-2011
3   04-03-2010

I am looking for IDs who had at least one date before 05-01-2010 AND at least one date after 05-02-2010

I tried the following:

WHERE tb1.DATES < '05-01-2010' AND tb1.DATES > '05-02-2010'

I don't think it's correct because I wasn't getting the right IDs when I did that and there's something wrong with that logic.

Can someone explain what I am doing wrong here?


Solution

  • The SQL command SELECT * FROM tb1 WHERE tb1.DATES < '05-01-2010' AND tb1.DATES > '05-02-2010' is asking "find all the rows where the 'dates' field is before 1 May and after 2 May" which - when put in English - is obviously none of them.

    Instead, the command should be asking "find all the IDs which have a record that is before 1 May, and another record after 2 May" - creating the need to look at multiple records for each ID.

    As @Martheen suggested, you could do this with two (sub)queries e.g.,

    SELECT A.ID
    FROM
        (SELECT DISTINCT tb1.ID
          FROM mytable tb1
          WHERE tb1.[dates] < '20100501'
        ) AS A
    INNER JOIN
        (SELECT DISTINCT tb1.ID
          FROM mytable tb1
          WHERE tb1.[dates] > '20100502'
        ) AS B
        ON A.ID = B.ID;
    

    or using INTERSECT

    SELECT DISTINCT tb1.ID
        FROM mytable tb1
        WHERE tb1.[dates] < '20100501'
    INTERSECT
        SELECT mt2.ID
        FROM mytable mt2
        WHERE mt2.[dates] > '20100502';
    

    The use of DISTINCT in the above is so that you only get one row per ID, no matter how many rows they have before/after the relevant dates.

    You could also do it via GROUP BY and HAVING - which in this particular case is easy as if any dates are before 1 May, then their earliest date must be before 1 May (and correspondingly for their max data and 2 May) e.g.,

    SELECT mt1.ID
        FROM mytable mt1
        GROUP BY mt1.ID
        HAVING MIN(mt1.[dates]) < '20100501' AND MAX(mt1.[dates]) > '20100502';
    

    Here is a db<>fiddle with all 3 of these; all provide the same answer (one row, with ID = 1).

    Finally, you should use an unambiguous format for your dates. My preferred one of these is 'yyyyMMdd' with no dashes/slashes/etc (as these make them ambiguous).

    Different countries/servers/etc will convert the dates you have there differently e.g., SQL Server UTC string comparison not working