Search code examples
sqlsql-serverdatewindow-functionsgaps-and-islands

select rows in sql with end_date >= start_date for each ID repeated multiple times


Attached the image how the data looks like. In my table I have 3 columns id, start date, and end date, and values like this:

id     start date  end date
-------------------------------
100    2015-01-01  2015-12-31
100    2016-01-10  2018-12-31
200    2015-02-15  2016-03-15
200    2016-03-15  2016-12-31
300    2016-01-01  2016-12-31
400    2017-01-01  2017-12-31
500    2017-02-01  2017-12-31
600    2017-01-15  2017-03-05
600    2017-02-01  2018-12-31

I want my output to be

id     start date  end date
--------------------------------
100    2015-01-01  2015-12-31
100    2016-01-10  2018-12-31
200    2015-02-15  2016-12-31
300    2016-01-01  2016-12-31
400    2017-01-01  2017-12-31
500    2017-02-01  2017-12-31
600    2017-01-15  2018-12-31

Query:

select 
    id, *
from 
    dbo.test_sl 
where 
    id in (select id
           from dbo.test_sl 
           where end_date >= start_date 
           group by id)

Please help me get the output I am looking for.

enter image description here


Solution

  • This is an example of a gaps-and-islands problem. In this case, you want to find adjacent rows that do not overlap for the same id. These are the starts of groups. A cumulative sum of the starts of a group providing a grouping number, which can be used for aggregation.

    In a query, this looks like:

    select id, min(startdate), max(enddate)
    from (select t.*,
                 sum(isstart) over (partition by id order by startdate) as grp
          from (select t.*,
                       (case when exists (select 1
                                          from test_sl t2
                                          where t2.id = t.id and
                                                t2.startdate < t.startdate and
                                                t2.enddate >= t.startdate
                                         )
                             then 0 else 1
                         end) as isstart
                from test_sl t
               ) t
          ) t
    group by id, grp;