sql oracle-database plsql group-by grouping

Grouping of Sub-Tables while Maintaining Ranges of Time

I'm working with a table that maintains a history of records for each key, alongside a START_DATE and END_DATE field that signifies when the record changes go in and out of action, respectively:

KEY	START_DATE	END_DATE	FIELD_ONE	FIELD_TWO
1	2000-01-01	2000-01-02	10	20
1	2000-01-03	2000-01-04	10	21
1	2000-01-05	2000-01-06	11	21
1	2000-01-07	2000-01-08	10	20
1	2000-01-09	2000-01-10	10	21

FIELD_TWO is not relevant for my purposes, and so I'd like to remove it in my extract. I end up with a table like this, in doing so:

KEY	START_DATE	END_DATE	FIELD_ONE
1	2000-01-01	2000-01-02	10
1	2000-01-03	2000-01-04	10
1	2000-01-05	2000-01-06	11
1	2000-01-07	2000-01-08	10
1	2000-01-09	2000-01-10	10

But now, I have records that are pretty much duplicates of one another - the START_DATE and END_DATE fields are signifying a change that cannot be seen in my sub-table. And so I'd like to "condense" the table to the following result:

KEY	START_DATE	END_DATE	FIELD_ONE
1	2000-01-01	2000-01-04	10
1	2000-01-05	2000-01-06	11
1	2000-01-07	2000-01-10	10

How can this be achieved?

I, naively, tried to use a GROUP BY over KEY and FIELD_ONE with a MIN() and MAX() function applied to the START_DATE and END_DATE fields, but this does not work for the above example, where FIELD_ONE changes to and from one value (10, in this example). You, of course, end up with the following:

KEY	START_DATE	END_DATE	FIELD_ONE
1	2000-01-01	2000-01-10	10
1	2000-01-05	2000-01-06	11

I'm working with PL/SQL, but can make the necessary translations from other SQL flavours as long as PL/SQL has the same tools.

Thank you!

Solution

This is a Gaps & Island problem. You can use the traditional solution:

select key,
  min(start_date) as start_date,
  max(end_date) as end_date, 
  field_one 
from (
  select
    x.*, 
    sum(case when prev_one = field_one and prev_end + 1 = start_date
        then 0 else 1 end) over(partition by key order by start_date) as g
  from (
    select t.*,
      lag(field_one) over(partition by key order by start_date) as prev_one,
      lag(end_date) over(partition by key order by start_date) as prev_end
    from t
  ) x
) y
group by key, g, field_one

Result:

 KEY  START_DATE  END_DATE   FIELD_ONE 
 ---- ----------- ---------- --------- 
 1    01-JAN-00   04-JAN-00  10        
 1    05-JAN-00   06-JAN-00  11        
 1    07-JAN-00   10-JAN-00  10

See running example at db<>fiddle.