Search code examples
oracle-databaseplsqlnested-loops

Print non-overlapping date blocks in oracle


I have the following table:

ID START_DATE    END_DATE           
 1 01.06.2015  20.06.2015 
 2 05.06.2015  25.06.2015 
 3 03.06.2015  10.06.2015 
 4 07.06.2015  23.06.2015 
 5 21.06.2015  30.06.2015 
 6 02.06.2015  10.06.2015 
 7 05.06.2015  15.06.2015 
 8 05.06.2015  08.06.2015 
 9 16.06.2015  20.06.2015 

There are overlapping timelines in the table. For eg, 03/06/2015-10/06/2015 resides between 1/06/2015-20/06/2015. Infact the non-overlapping timelines are 1/06/2015,05/06/2015 and 21/06/2015. I have to retrieve these values. I am aware of the fact that I have to use nested loop to compare a particular date with every other date. What I have done is:

DECLARE
 min_sd DATE;
 max_ed DATE;
 sd DATE;
 ed DATE;
 i INT:=1;
 j INT:=1;

PROCEDURE date_block
IS

BEGIN

  WHILE i<=9 loop
  SELECT start_date,end_date INTO min_sd,max_ed FROM sd_ed WHERE id=i;

    WHILE j<=9 loop
     SELECT start_date,end_date INTO sd,ed FROM sd_ed WHERE id=j;

      IF min_sd<=sd AND max_ed>=ed THEN
        j:=j+1;

      ELSE
        Dbms_Output.put_line(sd||' - '||ed);
        j:=j+1;

       END IF;
      i:=i+1;

    END LOOP;

  END loop;
END;

BEGIN
  date_block();
END;

The output I am getting is:

05-JUN-15 - 25-JUN-15
07-JUN-15 - 23-JUN-15
21-JUN-15 - 30-JUN-15

I think the dates are not being compared to the dates present above them in the table.Can anyone help? I am using oracle.

Using SQL only,

SELECT a.*
FROM   (SELECT *
        FROM  sd_ed
        ORDER BY id) a
WHERE   NOT EXISTS (SELECT 1
               FROM   sd_ed_test b
               WHERE  b.start_date <= a.start_date
               AND    b.end_date   >= a.end_date
               AND    b.id         <  a.id
               );

Solution

  • Here's a pure SQL solution:

    SQL Fiddle

    Oracle 11g R2 Schema Setup:

    create table sd_ed as with sd_ed(ID, START_DATE, END_DATE) as (
    select 1, to_date('01.06.2015','dd.mm.yyyy'), to_date('20.06.2015','dd.mm.yyyy') from dual union all
    select 2, to_date('05.06.2015','dd.mm.yyyy'), to_date('25.06.2015','dd.mm.yyyy') from dual union all
    select 3, to_date('03.06.2015','dd.mm.yyyy'), to_date('10.06.2015','dd.mm.yyyy') from dual union all
    select 4, to_date('07.06.2015','dd.mm.yyyy'), to_date('23.06.2015','dd.mm.yyyy') from dual union all
    select 5, to_date('21.06.2015','dd.mm.yyyy'), to_date('30.06.2015','dd.mm.yyyy') from dual union all
    select 6, to_date('02.06.2015','dd.mm.yyyy'), to_date('10.06.2015','dd.mm.yyyy') from dual union all
    select 7, to_date('05.06.2015','dd.mm.yyyy'), to_date('15.06.2015','dd.mm.yyyy') from dual union all
    select 8, to_date('05.06.2015','dd.mm.yyyy'), to_date('08.06.2015','dd.mm.yyyy') from dual union all
    select 9, to_date('16.06.2015','dd.mm.yyyy'), to_date('20.06.2015','dd.mm.yyyy') from dual
    ) select * from sd_ed;
    

    Query 1:

    with super as(
      -- Select the super ranges.  All other ranges are
      -- completely contained in at least one super range
      select id, start_date, end_date
        from sd_ed a
       where not exists (select 1
                           from sd_ed b
                          where a.id <> b.id
                            and b.start_date <= a.start_date
                            and a.end_date <= b.end_date)
    ), hier(id, start_date, end_date) as (
    -- Select all record with a start date not between
    -- any other records start and end dates
    select id, start_date, end_date from super
     where not exists(select 1 from super d1
                       where d1.id <> super.id
                         and super.start_date between d1.start_date and d1.end_date
                         and super.start_date <> d1.start_date)
    -- Recursively select records that overlap the current range
    -- but with end dates after the end date of the current range
    union all
    select sd_ed.id
         , prev.start_date
         , greatest(sd_ed.end_date, prev.end_date)
      from hier prev
      join sd_ed
        on sd_ed.id <> prev.id
       and sd_ed.start_date <= prev.end_date
       and prev.end_date < sd_ed.end_Date
    )
    -- Get the max end_date for each start date.
    -- Start Dates are already minimum for any range.
    select start_date, max(end_Date) end_date
      from hier
     group by start_date
    

    Results:

    |             START_DATE |               END_DATE |
    |------------------------|------------------------|
    | June, 01 2015 00:00:00 | June, 30 2015 00:00:00 |
    

    For this data set identifying the super ranges isn't really needed since the hierarchical query will handle it just fine, but for larger data sets this initial pruning will reduced the amount of work that needs to be done by the hierarchical query.