Search code examples
sqlsql-servergaps-and-islands

Verify consecutive codes around a date


I have two tables:
Table 1 consists of ID and INDX. INDX is the date around the CODE from Table2 needs to be verified.
Table2 consists of ID, CODE, DAT, QRTR. DAT is the date on which the CODE is given with corresponding quarter of the year in QRTR.

Table1:

ID INDX
1 2014-06-07

Table2:

ID CODE DAT QRTR
1 A 2013-08-20 20133
1 A 2013-12-20 20134
1 A 2014-01-02 20141
1 A 2014-11-18 20144
1 A 2015-02-08 20151
1 A 2015-04-20 20152
1 A 2015-11-16 20154
1 A 2016-04-12 20162
1 A 2017-01-20 20171
1 A 2017-12-12 20174
1 A 2018-01-03 20181
1 A 2019-05-20 20193
1 A 2020-01-01 20201
1 A 2021-04-16 20212
1 A 2021-08-20 20213
1 A 2021-12-30 20214

Now I need to know if CODE 'A' was present within 300 days until INDX date.
If so, I need to verify if CODE 'A' is given at least one QRTR per year starting with the first data entry of CODE 'A' (within 300 days prior INDX) including the future CODEs.

I tried the following:

SELECT
    t1.ID,
    years.Year,
    CASE
        WHEN EXISTS (
            SELECT 1
            FROM table2 t2
            WHERE t2.ID = t1.ID
                AND t2.CODE = 'A'
                AND years.Year = YEAR(DATEADD(QUARTER, t2.QRTR % 10 - 1, DATEADD(YEAR, t2.QRTR / 10 - 1900, '19000101')))
        ) THEN 'Yes'
        ELSE 'No'
    END AS IsQuarterly
FROM
    (
        SELECT DISTINCT ID
        FROM table1
    ) AS t1
CROSS JOIN
    (
        SELECT DISTINCT YEAR(DATEADD(QUARTER, (QRTR % 10) - 1, DATEADD(YEAR, (QRTR / 10) - 1900, '19000101'))) AS Year
        FROM table2
        WHERE CODE = 'A'
    ) AS years
ORDER BY
    t1.ID,
    years.Year;

Data: db<>fiddle.

For the sake of presentation clarity I just included ID 1 in the tables above.

With my query i just get if at least one QRTR a year CODE 'A' is present but i dont get information until which time CODE 'A' was present and whether there has been a new period of CODE 'A'.

I expect the following, because i think its better to derive a period on which the CODE 'A' was present at least one QRTR a year.

ID START END
1 2013-08-20 2021-12-30
2 2017-12-12 2019-05-20
2 2021-08-20 2021-12-30

I am not sure what will be the best solution to display the result.


Solution

  • This is a classic Gaps-and-Islands problem.

    You first need to aggregate up the data per year, and check for the total count per year.

    Then use gaps-and-islands techniques to split it into groups of rows which are consecutive years, finally taking the min and max dates for each of those groups.

    In older versions of SQL Server, use YEAR( instead of DATETRUNC(year,, the latter is more performant if available.

    WITH Data AS (
        SELECT
          t2.ID,
          Year = DATETRUNC(year, t2.DAT),
          MinDate = MIN(t2.DAT),
          Maxdate = MAX(t2.DAT),
          PrevYear =   LAG(DATETRUNC(year, t2.DAT)) OVER (PARTITION BY t2.ID ORDER BY DATETRUNC(year, t2.DAT)),
          NextYear =  LEAD(DATETRUNC(year, t2.DAT)) OVER (PARTITION BY t2.ID ORDER BY DATETRUNC(year, t2.DAT))
        FROM Table1 t1
        JOIN Table2 t2
          ON t2.ID = t1.ID
         AND t2.DAT >= DATEADD(day, -300, t1.INDX)
        GROUP BY
          t2.ID,
          DATETRUNC(year, t2.DAT)
    ),
    Islands AS (
        SELECT
          *,
          NextMaxDate = LEAD(d.MaxDate) OVER (PARTITION BY d.ID ORDER BY d.Year)
        FROM Data d
        WHERE d.PrevYear IS NULL
           OR d.PrevYear < DATEADD(year, -1, Year)
           OR d.NextYear IS NULL
           OR d.NextYear > DATEADD(year, 1, Year)
    )
    SELECT
      ID,
      Start = MinDate,
      [End] = ISNULL(NextMaxDate, MaxDate)
    FROM Islands i
    WHERE i.PrevYear IS NULL
       OR i.PrevYear < DATEADD(year, -1, Year);
    

    db<>fiddle