Search code examples
sqlgaps-and-islandsoracle19c

find a gap in a given date range/year with multiple or single date ranges in Oracle SQL


I have such a table with date ranges, There could be 1 or multiple date ranges, overlapping or consecutive as well. I want to find out if there is a gap and if then show the missing date range for a part. the gap should be tested for the year 2024

part    start   end
a   1.1.2023    31.3.2024
a   1.1.2023    31.12.2025
a   1.7.2024    31.12.2025
b   1.1.2024    30.6.2024
b   1.7.2024    31.12.2024
c   1.10.2023   30.9.2024
d   1.1.2023    30.6.2024
d   1.12.2024   31.12.2025

so for part A there is no gap, 2 ranges are not consecutive but one of them cover the whole year 2024, so the part a should be filtered out. part B has 2 ranges and they together cover the year 2024, thus filtered out. part C has 1 date range till 30.9.2024 , so new date range will be start = 1.10.2024 and end = 31.12.2024. part D has 2 date ranges and there is a gap in 2024,this new date range will be start=1.7.2024 and end 30.11.2024.

desired output

part    start       end
c       1.10.2024   31.12.2024
d       1.7.2024    31.12.2024

Solution

  • You can UNPIVOT the start- and end-dates to a single column and then use analytic functions to generate a cumulative count the number of starts minus ends to find when the ranges start and end and then UNPIVOT again to find the pairs (so if there is a gap in the range before and a gap in the range after then a single row is converted to two rows for both parts of the missing range) and then filter:

    SELECT part,
           COALESCE(start_dt, DATE '2024-01-01') AS start_dt,
           end_dt - 1 AS end_dt
    FROM   (
      SELECT part,
             LAG(dt) OVER (PARTITION BY part ORDER BY dt)
               AS prev_dt,
             dt,
             LEAD(dt, 1, DATE '2025-01-01') OVER (PARTITION BY part ORDER BY dt)
               AS next_dt,
             SUM(type) OVER (PARTITION BY part ORDER BY dt) AS in_range
      FROM   (
        SELECT part, dt, SUM(type) AS type
        FROM   (
          SELECT part, start_date, end_date + 1 AS end_date
          FROM   table_name
        )
        UNPIVOT (dt FOR type IN (start_date AS 1, end_date AS -1))
        GROUP BY part, dt
      )
    )
    UNPIVOT (
      (start_dt, end_dt) FOR range_type IN (
        (prev_dt, dt) AS 'before',
        (dt, next_dt) AS 'after'
      )
    )
    WHERE  (   start_dt IS NULL
           AND DATE '2024-01-01' < end_dt
           AND in_range > 0
           AND range_type = 'before'
           )
    OR     (   start_dt < DATE '2025-01-01'
           AND in_range = 0
           AND range_type = 'after'
           )
    

    Which, for the sample data:

    CREATE TABLE table_name (part, start_date, end_date) AS
    SELECT 'a', DATE '2023-01-01', DATE '2024-03-31' FROM DUAL UNION ALL
    SELECT 'a', DATE '2023-01-01', DATE '2025-12-31' FROM DUAL UNION ALL
    SELECT 'a', DATE '2024-07-01', DATE '2025-12-31' FROM DUAL UNION ALL
    SELECT 'b', DATE '2024-01-01', DATE '2024-06-30' FROM DUAL UNION ALL
    SELECT 'b', DATE '2024-07-01', DATE '2024-12-31' FROM DUAL UNION ALL
    SELECT 'c', DATE '2023-10-01', DATE '2024-09-30' FROM DUAL UNION ALL
    SELECT 'd', DATE '2023-01-01', DATE '2024-06-30' FROM DUAL UNION ALL
    SELECT 'd', DATE '2024-12-01', DATE '2025-12-31' FROM DUAL UNION ALL
    SELECT 'e', DATE '2024-07-01', DATE '2025-01-01' FROM DUAL UNION ALL
    SELECT 'f', DATE '2024-04-01', DATE '2024-09-30' FROM DUAL UNION ALL
    SELECT 'g', DATE '2023-01-01', DATE '2023-12-31' FROM DUAL UNION ALL
    SELECT 'h', DATE '2025-01-01', DATE '2025-12-31' FROM DUAL;
    

    Outputs:

    PART START_DT END_DT
    c 2024-10-01 00:00:00 2024-12-31 00:00:00
    d 2024-07-01 00:00:00 2024-11-30 00:00:00
    e 2024-01-01 00:00:00 2024-06-30 00:00:00
    f 2024-01-01 00:00:00 2024-03-31 00:00:00
    f 2024-10-01 00:00:00 2024-12-31 00:00:00
    g 2024-01-01 00:00:00 2024-12-31 00:00:00
    h 2024-01-01 00:00:00 2024-12-31 00:00:00

    fiddle