Search code examples
sqloracleoracle11ggaps-and-islands

Analytic function/logic to get min and max record date in Oracle


I have a requirement to fetch value based on eff_dt and end date. given below sample data.

Database : Oracle 11g

Example data:

id val eff_date end_date
10 100 01-Jan-21 04-Jan-21
10 105 05-Jan-21 07-Jan-21
10 100 08-Jan-21 10-Jan-21
10 100 11-Jan-21 17-Jan-21
10 100 18-Jan-21 21-Jan-21
10 110 22-Jan-21 null

output:

id val eff_date end_date
10 100 01-Jan-21 04-Jan-21
10 105 05-Jan-21 07-Jan-21
10 100 08-Jan-21 21-Jan-21
10 110 22-Jan-21 null

Solution

  • You can use the ROW_NUMBER analytic function and then aggregate:

    SELECT id,
           val,
           MIN(eff_date) AS eff_date,
           MAX(end_date) AS end_date
    FROM   (
      SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY eff_date)
               - ROW_NUMBER() OVER (PARTITION BY id, val ORDER BY eff_date) AS grp
      FROM   table_name t
    )
    GROUP BY id, val, grp
    ORDER BY id, eff_date;
    

    Which, for the sample data:

    CREATE TABLE table_name (id, val, eff_date, end_date) AS
    SELECT 10, 100, DATE '2021-01-01', DATE '2021-01-04' FROM DUAL UNION ALL
    SELECT 10, 105, DATE '2021-01-05', DATE '2021-01-07' FROM DUAL UNION ALL
    SELECT 10, 100, DATE '2021-01-08', DATE '2021-01-10' FROM DUAL UNION ALL
    SELECT 10, 100, DATE '2021-01-11', DATE '2021-01-17' FROM DUAL UNION ALL
    SELECT 10, 100, DATE '2021-01-18', DATE '2021-01-21' FROM DUAL UNION ALL
    SELECT 10, 110, DATE '2021-01-22', null FROM DUAL;
    

    Outputs:

    ID VAL EFF_DATE END_DATE
    10 100 2021-01-01 00:00:00 2021-01-04 00:00:00
    10 105 2021-01-05 00:00:00 2021-01-07 00:00:00
    10 100 2021-01-08 00:00:00 2021-01-21 00:00:00
    10 110 2021-01-22 00:00:00 null

    From Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row processing:

    SELECT *
    FROM   table_name t
    MATCH_RECOGNIZE(
      PARTITION BY id
      ORDER BY eff_date
      MEASURES
        FIRST(val) AS val,
        FIRST(eff_date) AS eff_date,
        LAST(end_date) AS end_date
      PATTERN (same_val+)
      DEFINE same_val AS FIRST(val) = val
    )
    

    Which has the same output and is likely to be more efficient.

    fiddle