Search code examples
sqlsql-serveroverlapgaps-and-islands

Find common date range from a set of overlapping date ranges


How to find the overlapped (common) date ranges from a group of given date ranges?

Need to find the Overlapping (common) Date-Ranges considering all the Events (EID) for a particular Program (PID).

Example: Program (PID=13579) had two Date Ranges for Event (EID=2).

Previously Posted at link

I had Already checked here (But Not useful): Link

Image for Visualizing Overlapped (Common) Date Ranges

The Sample Schema and Data:

CREATE TABLE #EventsTBL
(
    PID INT,
    EID INT,
    StartDate DATETIME,
    EndDate DATETIME
);

INSERT INTO #EventsTBL
VALUES
(13579, '1', '01 Jan 2018', '31 Mar 2019'),
(13579, '2', '01 Feb 2018', '31 May 2018'),
(13579, '2', '01 Jul 2018', '31 Jan 2019'),
(13579, '7', '01 Mar 2018', '31 Mar 2019'),
(13579, '5', '01 Feb 2018', '30 Apr 2018'),
(13579, '5', '01 Oct 2018', '31 Mar 2019'),
(13579, '8', '01 Jan 2018', '30 Apr 2018'),
(13579, '8', '01 Jun 2018', '31 Dec 2018'),
(13579, '13', '01 Jan 2018', '31 Mar 2019'),
(13579, '6', '01 Apr 2018', '31 May 2018'),
(13579, '6', '01 Sep 2018', '30 Nov 2018'),
(13579, '4', '01 Feb 2018', '31 Jan 2019'),
(13579, '19', '01 Mar 2018', '31 Jul 2018'),
(13579, '19', '01 Oct 2018', '28 Feb 2019'),
--
(13570, '16', '01 Feb 2018', '30 Jun 2018'),
(13570, '16', '01 Aug 2018', '31 Aug 2018'),
(13570, '16', '01 Oct 2018', '28 Feb 2019'),
(13570, '23', '01 Mar 2018', '30 Jun 2018'),
(13570, '23', '01 Nov 2018', '31 Jan 2019');

Output should be:

PID     StartDate       EndDate
13579   01-Apr-2018     30-Apr-2018
13579   01-Oct-2018     30-Nov-2018
13570   01-Mar-2018     30-Jun-2018
13570   01-Nov-2018     31-Jan-2019

Solution

  • This answer counts the number of overlapping intervals. It assumes that date ranges having same EID do not overlap. Below is the query with explanation inline:

    DECLARE @EventsTBL TABLE (PID INT, EID INT, StartDate DATETIME, EndDate DATETIME);
    INSERT INTO @EventsTBL VALUES
    (13579, 1,  '01 Jan 2018', '31 Mar 2019'),
    (13579, 2,  '01 Feb 2018', '31 May 2018'),
    (13579, 2,  '01 Jul 2018', '31 Jan 2019'),
    (13579, 7,  '01 Mar 2018', '31 Mar 2019'),
    (13579, 5,  '01 Feb 2018', '30 Apr 2018'),
    (13579, 5,  '01 Oct 2018', '31 Mar 2019'),
    (13579, 8,  '01 Jan 2018', '30 Apr 2018'),
    (13579, 8,  '01 Jun 2018', '31 Dec 2018'),
    (13579, 13, '01 Jan 2018', '31 Mar 2019'),
    (13579, 6,  '01 Apr 2018', '31 May 2018'),
    (13579, 6,  '01 Sep 2018', '30 Nov 2018'),
    (13579, 4,  '01 Feb 2018', '31 Jan 2019'),
    (13579, 19, '01 Mar 2018', '31 Jul 2018'),
    (13579, 19, '01 Oct 2018', '28 Feb 2019'),
    (13570, 16, '01 Feb 2018', '30 Jun 2018'),
    (13570, 16, '01 Aug 2018', '31 Aug 2018'),
    (13570, 16, '01 Oct 2018', '28 Feb 2019'),
    (13570, 23, '01 Mar 2018', '30 Jun 2018'),
    (13570, 23, '01 Nov 2018', '31 Jan 2019');
    
    WITH cte1 AS (
        /*
         * augment the data with the number of distinct EID per PID
         * we will need this later
         */
        SELECT e.PID, a.EIDCount, StartDate, EndDate
        FROM @EventsTBL AS e
        JOIN (
            SELECT PID, COUNT(DISTINCT EID) AS EIDCount
            FROM @EventsTBL
            GROUP BY PID
        ) AS a ON e.PID = a.PID
    ), cte2 AS (
        /*
         * build a list of "points in time" at which an event started or ended
         * and the number concurrent events changed
         * the zero value rows are required!
         */
        SELECT PID, EIDCount, StartDate AS pdate, 1 AS pval
        FROM cte1
        UNION ALL
        SELECT PID, EIDCount, EndDate, 0
        FROM cte1
        UNION ALL
        SELECT PID, EIDCount , DATEADD(DAY, 1, EndDate), -1
        FROM cte1
    ), cte3 AS (
        /*
         * calculate running sum of pval over dates; minus ones first
         */
        SELECT PID, EIDCount, pdate, SUM(pval) OVER (PARTITION BY PID ORDER BY pdate, pval) AS evtcount
        FROM cte2
    ), cte4 AS (
        /*
         * consolidate data for same dates and we are done with the main part
         */
        SELECT PID, EIDCount, pdate, MAX(evtcount) AS evtcount
        FROM cte3
        GROUP BY PID, EIDCount, pdate
    ), cte5 AS (
        /*
         * assign "change flag" to rows where number of concurrent events
         * enters or exits the required count w.r.t. previous row
         */
        SELECT PID, EIDCount, pdate, evtcount, CASE
            WHEN evtcount < EIDCount AND LAG(evtcount) OVER (PARTITION BY PID ORDER BY pdate) < EIDCount THEN 0
            WHEN evtcount = EIDCount AND LAG(evtcount) OVER (PARTITION BY PID ORDER BY pdate) = EIDCount THEN 0
            ELSE 1
        END AS chg
        FROM cte4
    ), cte6 AS (
        /*
         * convert "change flag" to "group numbers" over consecutive rows using running sum
         */
        SELECT PID, EIDCount, pdate, evtcount, SUM(chg) OVER (PARTITION BY PID ORDER BY pdate) AS grp
        FROM cte5
    )
    /*
     * group rows by pid and group numbers
     */
    SELECT PID, MIN(pdate) AS StartDate, MAX(pdate) AS EndDate
    FROM cte6
    WHERE evtcount = EIDCount
    GROUP BY PID, grp
    ORDER BY PID, StartDate
    

    Demo on db<>fiddle