Search code examples
sqldategoogle-bigquerysql-server-2012range

Dynamically generating date range starts in SQL


Imagine you have a set of dates. You want any date which is within X days of the lowest date to be "merged" into that date. Then you want to repeat until you have merged all date points.

For example:

ID DatePoints
1 2023-01-01
2 2023-01-02
3 2023-01-12
4 2023-01-21
5 2023-02-01
6 2023-02-02
7 2023-03-01

If you applied this rule to this data using 10 days as your X, you would end up with this output:

DateRangeStarts
2023-01-01
2023-01-12
2023-02-01
2023-03-01

IDs 1 and 2 into range 1, IDs 3 and 4 into range 2, IDs 5 and 6 into range 3, and ID 7 into range 4.

Is there any way to do this without a loop? Answer can work in SQL Server or BigQuery. Thanks


Solution

  • You could consider something like the following. It's not pretty and I'm not at all confident it is the best solution, but I do think it works. Maybe it's a good starting point for you to work from.

    WITH cte AS
    (
      SELECT min(datepoint) datepoint
      FROM test
      UNION ALL
      SELECT min(t.datepoint) OVER() datepoint
      FROM test t CROSS APPLY (SELECT max(cte.datepoint) OVER() md FROM cte) c
      WHERE t.datepoint > DATEADD(DAY, 10, c.md)
    )
    SELECT distinct datepoint
    FROM cte
    ORDER BY datepoint
    

    (You might want to change the > to a >=, depending on what counts as within X days.)

    The basic idea is to get the minimum date from your table into the cte, then recursively get the minimum date from your table that is bigger than the current maximum date in the cte + X days.

    It gets messy because of the limitations SQL Server places on recursive CTEs. They can't be used in subqueries, with normal OUTER JOINs, or with aggregate functions. Therefore, I use CROSS APPLY and the window versions of min/max. This gets the correct result, but multiple times, so I'm forced to use DISTINCT to clean it up afterward.

    Depending on your data, it might be better to do a loop anyway, but I think this is an option to consider.

    Here's a Fiddle of it working.