Search code examples
sqlsql-server-2008group-bysumsqldatetime

SQL SUM up date ranges that collide on a group by


I have a table with columns: name, start date (a date value) and finish date(a date value). I want to group by name adding up the dates so I get the total time with no collisions. So, if I have a table

name | start date | finish date
===============================
a    | 20/10/2015 | 22/10/2015
a    | 21/10/2015 | 22/10/2015
a    | 26/10/2015 | 27/10/2015

So, if I group by name, the 3 rows will aggregate, if I simply add the DATEDIFF day per row I'll get 4, if I calculate the DATEDIFF between the MIN start date and the MAX finish date it will be 7, when in reality the right answer would be 3, since the second row collides with the first one and I only need to count that time once.


Solution

  • Thanks for your comments below. I have used a completely different approach. First L build a calendar CTE a with all the dates that exist in your table. You may use an existing calendar table from your database if you have one. Then in the CTE b I CROSS JOIN the calendar CTE to get the dates that exist for the date ranges. In this CTE it does not matter how many overlapping ranges you have as The date will be included once only using the GROUP BY [name] clause. And now all you need to do is to count the number of the individual dates in the CTE c:

    SQL Fiddle

    MS SQL Server 2008 Schema Setup:

    CREATE TABLE Table1
        ([name] varchar(1), [start date] datetime, [finish date] datetime)
    ;
    
    INSERT INTO Table1
        ([name], [start date], [finish date])
    VALUES
        ('a', '2015-10-20 00:00:00', '2015-10-22 00:00:00'),
        ('a', '2015-10-21 00:00:00', '2015-10-22 00:00:00'),
        ('a', '2015-10-21 00:00:00', '2015-10-23 00:00:00'),
        ('a', '2015-10-26 00:00:00', '2015-10-27 00:00:00')
    ;
    

    Query 1:

    with dt as(
      select min([start date]) as sd, max([finish date]) as fd from Table1
    ),
    a as (
      select sd from dt
      union all 
      select dateadd(day, 1, a.sd)
      FROM a cross join dt
      where a.sd < fd
    ),
    b as(
      select [name], sd
      from table1 cross join a where a.sd between [start date] and [finish date]
      group by [name], sd
    ),
    c as (
      select [name], count(*) days from b group by [name]
    )
    select * from c
    option (maxrecursion 0)
    

    Results:

    | name | days |
    |------|------|
    |    a |    6 |