Search code examples
sqlcommon-table-expressionrecursive-cte

SQL recursive cte - sort overlapping range


I have a table like below

range_from range_end created_on
100 105 2023-01-01
106 110 2023-01-01
106 111 2023-06-01
120 130 2023-01-01
112 130 2023-06-01

I need to clean it to get a most updated without overlapping table.

The result should look like this

range_from range_end created_on
100 105 2023-01-01
106 111 2023-06-01
112 130 2023-06-01

The problem I have is I don't know how to let the loop keep running after 'where' or how to delete a row under select statement. My code is:

with recursive bin_range as (
        select bin_from, bin_end, created_on,
                       row_number() over(order by bin_from) rownum,
                from raw_BIN
                qualify rownum =1
  
         union all

        select v.bin_from, v.bin_end, v.created_on, 
               v.rownum
                from bin_range B
                join ( select bin_from, bin_end, created_on,
                        row_number() over(order by bin_from) rownum,
                         from raw_BIN
                )V on B.rownum +1 = V.rownum
        where v.bin_from < b.bin_end
)
select * from bin_range;

Thank you so much!!!


Solution

  • This is how I'm interpreting what you are looking for. Of course this assumes your sample represents all the complexity you need to handle.

    Since you didn't tag a DBMS, this is in SQL Server. But this should run fine on just about anything.

    Note that will discard rows that overlap at all with a record with a later create time.

    https://dbfiddle.uk/cYrHc9Gw

    create table raw_BIN  
      ( range_from integer,
        range_end integer,
        created_on date);
    
    insert into raw_BIN  values (100,105,'2023-01-01');
    insert into raw_BIN  values (106,110,'2023-01-01');
    insert into raw_BIN  values (106,111,'2023-06-01');
    insert into raw_BIN  values (120,130,'2023-01-01');
    insert into raw_BIN  values (112,130,'2023-06-01');
    
    SELECT
      *
    FROM
      raw_BIN t1
    WHERE
      NOT EXISTS (
        SELECT
          1
        FROM
          raw_BIN t2
        WHERE
          t2.created_on > t1.created_on
          AND t1.range_end >= t2.range_from
          AND t1.range_from <= t2.range_end
      )
    order
       by range_from,
          range_end