Search code examples
sql-serversumsubquerywindow-functionsrecursive-query

Recursive query to use a date returned in initial query as limit in subsequent query


I have a business need to project when a specific task needs to be done based on the usage of a task.
For example, you need to change the oil in your car every 3000 miles. Some days you drive 300 miles, and other days you drive 500 miles. When you hit 3000, you change the oil, and restart the counter. Based on a projected usage table, return a set of all the oil change dates.

I could do this in a table-valued function or some other 'coded' solution.
But I thought I could do it in one statement, a recursive cte perhaps.
I'm having difficulties 'joining' the next date into the WHERE of the recursive part.
And SQL doesn't like 'TOP 1' in a recursive CTE at all. :)

I would like a set like this:
enter image description here

This is what I've got:

WITH cte_MilesMX (RateDate,RunningRateMiles)
AS
(
    -- Initial query
    SELECT TOP 1 *
    FROM (
      SELECT 
      RateDate,
      SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
      FROM dbo.RatesbyDay
      WHERE RateDate > '2020-01-01') q1
    WHERE q1.RunningRateMiles >= 3000
    UNION ALL
    -- Recursive part
    SELECT TOP 1 *
    FROM (
      SELECT 
      rbd.RateDate,
      SUM(RateMiles) OVER (ORDER BY rbd.RateDate) AS RunningRateMiles
      FROM dbo.RatesbyDay rbd
      JOIN cte_MilesMX cte
        ON 1 = 1
      WHERE rbd.RateDate > cte.RateDate) q1
    WHERE q1.RunningRateMiles >= 3000
)

SELECT *
FROM   cte_MilesMX

If you want to fool with this, here is the example:
Any help would be greatly appreciated. Thanks.

CREATE TABLE RatesbyDay(
    RateDate DATE,
    RateMiles INT);
INSERT INTO RatesbyDay VALUES ('2020-01-01',600)
INSERT INTO RatesbyDay VALUES ('2020-01-02',450)
INSERT INTO RatesbyDay VALUES ('2020-01-03',370)
INSERT INTO RatesbyDay VALUES ('2020-01-04',700)
INSERT INTO RatesbyDay VALUES ('2020-01-05',100)
INSERT INTO RatesbyDay VALUES ('2020-01-06',480)
INSERT INTO RatesbyDay VALUES ('2020-01-07',430)
INSERT INTO RatesbyDay VALUES ('2020-01-08',200)
INSERT INTO RatesbyDay VALUES ('2020-01-09',590)
INSERT INTO RatesbyDay VALUES ('2020-01-10',380)
INSERT INTO RatesbyDay VALUES ('2020-01-11',220)
INSERT INTO RatesbyDay VALUES ('2020-01-12',320)
INSERT INTO RatesbyDay VALUES ('2020-01-13',360)
INSERT INTO RatesbyDay VALUES ('2020-01-14',600)
INSERT INTO RatesbyDay VALUES ('2020-01-15',450)
INSERT INTO RatesbyDay VALUES ('2020-01-16',475)
INSERT INTO RatesbyDay VALUES ('2020-01-17',300)
INSERT INTO RatesbyDay VALUES ('2020-01-18',190)
INSERT INTO RatesbyDay VALUES ('2020-01-19',435)
INSERT INTO RatesbyDay VALUES ('2020-01-20',285)
INSERT INTO RatesbyDay VALUES ('2020-01-21',350)
INSERT INTO RatesbyDay VALUES ('2020-01-22',410)
INSERT INTO RatesbyDay VALUES ('2020-01-23',250)
INSERT INTO RatesbyDay VALUES ('2020-01-24',300)
INSERT INTO RatesbyDay VALUES ('2020-01-25',250)
INSERT INTO RatesbyDay VALUES ('2020-01-26',650)
INSERT INTO RatesbyDay VALUES ('2020-01-27',180)
INSERT INTO RatesbyDay VALUES ('2020-01-28',280)
INSERT INTO RatesbyDay VALUES ('2020-01-29',200)
INSERT INTO RatesbyDay VALUES ('2020-01-30',100)
INSERT INTO RatesbyDay VALUES ('2020-01-31',100)

-- this returns the 1st oil change assuming we just changed it on 1-1-2020
SELECT TOP 1 *
FROM (
  SELECT 
    RateDate,
    SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
FROM dbo.RatesbyDay
WHERE RateDate > '2020-01-01') q1
WHERE q1.RunningRateMiles >= 3000

-- the above query returned 1-9-2020 as the oil change, so when is the next one.
SELECT TOP 1 *
FROM (
  SELECT 
    RateDate,
    SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
FROM dbo.RatesbyDay
WHERE RateDate > '2020-01-09') q1
WHERE q1.RunningRateMiles >= 3000

-- etc. etc.
SELECT TOP 1 *
FROM (
  SELECT 
    RateDate,
    SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
FROM dbo.RatesbyDay
WHERE RateDate > '2020-01-17') q1
WHERE q1.RunningRateMiles >= 3000

SELECT TOP 1 *
FROM (
  SELECT 
    RateDate,
    SUM(RateMiles) OVER (ORDER BY RateDate) AS RunningRateMiles
FROM dbo.RatesbyDay
WHERE RateDate > '2020-01-26') q1
WHERE q1.RunningRateMiles >= 3000

Solution

  • This isn't a recursive CTE but it does do what you're what you're trying to do. The technique goes by a couple different names... Usually either "Quirky Update" or "Ordered Update".

    First thing, notice that I added two new columns to your table and a clustered index. They are in fact necessary but if are unwilling or unable to modify the existing table, this works just as well with a #TempTable.

    For more detailed information, see Solving the Running Total and Ordinal Rank Problems (Rewritten)

    Also... fair warning, this technique isn't without it's detractors due to the fact that Microsoft doesn't guarantee that it will work as expected.

    USE tempdb;
    GO
    
    IF OBJECT_ID('tempdb.dbo.RatesByDay', 'U') IS NOT NULL 
    BEGIN DROP TABLE tempdb.dbo.RatesByDay; END;
    GO
    
    CREATE TABLE tempdb.dbo.RatesByDay (
        RateDate date NOT NULL
            CONSTRAINT pk_RatesByDay PRIMARY KEY CLUSTERED (RateDate), -- clustered index is needed to control the direction of the update.
        RateMiles int NOT NULL,
        IsChangeDay bit NULL,
        MilesSinceLastChange int NULL
        );
    GO
    
    INSERT tempdb.dbo.RatesByDay (RateDate, RateMiles) VALUES
        ('2020-01-01',600),('2020-01-02',450),('2020-01-03',370),('2020-01-04',700),('2020-01-05',100),('2020-01-06',480),
        ('2020-01-07',430),('2020-01-08',200),('2020-01-09',590),('2020-01-10',380),('2020-01-11',220),('2020-01-12',320),
        ('2020-01-13',360),('2020-01-14',600),('2020-01-15',450),('2020-01-16',475),('2020-01-17',300),('2020-01-18',190),
        ('2020-01-19',435),('2020-01-20',285),('2020-01-21',350),('2020-01-22',410),('2020-01-23',250),('2020-01-24',300),
        ('2020-01-25',250),('2020-01-26',650),('2020-01-27',180),('2020-01-28',280),('2020-01-29',200),('2020-01-30',100),
        ('2020-01-31',100);
    
    --=====================================================================================================================
    
    DECLARE 
        @RunningMiles int = 0,
        @Anchor date;
    
    UPDATE rbd SET          
        @RunningMiles = rbd.MilesSinceLastChange = CASE WHEN @RunningMiles < 3000 THEN @RunningMiles ELSE 0 END + rbd.RateMiles,
        rbd.IsChangeDay = CASE WHEN @RunningMiles < 3000 THEN 0 ELSE 1 END,
        @Anchor = rbd.RateDate
    FROM
        dbo.RatesByDay rbd WITH (TABLOCKX, INDEX (1))
    WHERE 1 = 1
        AND rbd.RateDate > '2020-01-01'
    OPTION (MAXDOP 1);
    
    -------------------------------------
    
    SELECT * FROM dbo.RatesByDay rbd;
    

    And the results...

    RateDate   RateMiles   IsChangeDay MilesSinceLastChange
    ---------- ----------- ----------- --------------------
    2020-01-01 600         NULL        NULL
    2020-01-02 450         0           450
    2020-01-03 370         0           820
    2020-01-04 700         0           1520
    2020-01-05 100         0           1620
    2020-01-06 480         0           2100
    2020-01-07 430         0           2530
    2020-01-08 200         0           2730
    2020-01-09 590         1           3320
    2020-01-10 380         0           380
    2020-01-11 220         0           600
    2020-01-12 320         0           920
    2020-01-13 360         0           1280
    2020-01-14 600         0           1880
    2020-01-15 450         0           2330
    2020-01-16 475         0           2805
    2020-01-17 300         1           3105
    2020-01-18 190         0           190
    2020-01-19 435         0           625
    2020-01-20 285         0           910
    2020-01-21 350         0           1260
    2020-01-22 410         0           1670
    2020-01-23 250         0           1920
    2020-01-24 300         0           2220
    2020-01-25 250         0           2470
    2020-01-26 650         1           3120
    2020-01-27 180         0           180
    2020-01-28 280         0           460
    2020-01-29 200         0           660
    2020-01-30 100         0           760
    2020-01-31 100         0           860