Search code examples
sqlsql-servernested-loopsdatabase-cursor

Trying to Loop via Cursor Through Partition Selection Until Counter Value is Satisfied


I understand that there are other potential options and a cursor may not be the most efficient choice for what I'm trying to accomplish, but be that as it may I'm hoping to gather some guidance on how to structure this query.

My Starting data looks like below which exists within a SELECT statement:

Date ID StartingValue DailyAdjstVal UnitAdjustment EndingValue SumOfUnitAdjustment (Should Equal DailyAdjstVal
12/1/2022 d 7 6 0 7 0
12/1/2022 a 5 6 0 5 0
12/1/2022 c 4 6 0 4 0
12/1/2022 b 3 6 0 3 0
12/1/2022 e 2 6 0 2 0
12/2/2022 b 8 4 0 8 0
12/2/2022 c 4 4 0 4 0
12/2/2022 a 3 4 0 3 0
12/2/2022 d 2 4 0 2 0
12/2/2022 e 2 4 0 2 0
12/3/2022 d 6 5 0 6 0
12/3/2022 a 5 5 0 5 0
12/3/2022 c 4 5 0 4 0
12/3/2022 b 3 5 0 3 0
12/3/2022 e 0 5 0 0 0

The ending Output I'm trying to produce looks like below, with the last three columns updated by the query:

Date ID StartingValue DailyAdjstVal UnitAdjustment EndingValue SumOfUnitAdjustment (Should Equal DailyAdjstVal
12/1/2022 d 7 6 3 4 6
12/1/2022 a 5 6 2 3 6
12/1/2022 c 4 6 1 3 6
12/1/2022 b 3 6 0 3 6
12/1/2022 e 2 6 0 2 6
12/2/2022 b 8 4 3 5 4
12/2/2022 c 4 4 1 3 4
12/2/2022 a 3 4 0 3 4
12/2/2022 d 2 4 0 2 4
12/2/2022 e 2 4 0 2 4
12/3/2022 d 6 5 2 4 5
12/3/2022 a 5 5 2 3 5
12/3/2022 c 4 5 1 3 5
12/3/2022 b 3 5 0 3 5
12/3/2022 e 0 5 0 0 5

The operation I'm trying to produce is the following steps:

  • Cursor should operate within partitions of the data, looking at all values for a date and completing before moving onto the next date within the set until all Date values are completed.
  • Sort the set ordering by Date DESC and StartingValue DESC
  • Set @Counter = 0
  • Beginning with the highest StartingValue for that Date and iterating down, WHILE EndingValue >3, add +1 to UnitAdjustment, -1 to EndingValue, and +1 to @Counter.
  • While @Counter < DailyAdjstVal, load the next row and perform the same as above.
  • If Cursor reaches the end of the partition set, it should start again with the first row in the series and continue until @Counter = DailyAdjstVal. After which time it should load the next date and begin the process again with the same logic.
  • Once it completes the process, it should update SumOfUnitAdjustment to total all EndingValue based on Date Partition which should, then, equal to DailyAdjstVal.

I believe for this process I'll need two cursors with the first to iterate through Days within the table set and an Inner cursor to perform the cell and @Counter updates. Apologies if there's any confusion in my steps, I appreicate any assistance!


Solution

  • So as I re-imagine the problem, for each date, we have N rows having a defined order, each with a potential adjustment = StartingValue - 3 (not less than zero), and an available adjustment DailyAdjstVal for that date that we would like to distribute across those rows.

    Although there are relatively simple ways to distribute integral amounts evenly across rows, having a per-row limit complicates things and requires a more involved approach.

    Below is SQL that gathers some initial information for each date, uses a recursive CTE to step through increasing levels of adjustments until the adjustment budget is exhausted, selects the maximum viable adjustment for each date (with residuals), and then applies those adjustments to the starting data.

    ;WITH OrderedData AS (
        -- Assign a sequence number for rows having dame date.
        -- Calculate potential adjustment per row.
        SELECT
            Date, ID, StartingValue, DailyAdjstVal,
            GREATEST(0, StartingValue - @ThresholdValue) AS PotentialAdjust,
            ROW_NUMBER() OVER(PARTITION BY Date ORDER BY StartingValue DESC, ID) AS Seq
        FROM StartingData
    ),
    DailyAdjust AS (
        -- Get potential adjustment stats per date
        SELECT OD.Date,
            MAX(OD.DailyAdjstVal) AS DailyAdjstVal,
            SUM(OD.PotentialAdjust) AS SumPotentialAdjust,
            MAX(OD.PotentialAdjust) AS MaxPotentialAdjust
        FROM OrderedData OD
        GROUP BY OD.Date
    ),
    RowsPerAdjustment AS (
        -- For each date and adjustmenr value combination, calculate the
        -- number of rows that could be adjusted to that level.
        -- Skip if DailyAdjstVal will cover all potential adjustments.
        SELECT DA.Date, S.Adjust, C.AdjustCount
        FROM DailyAdjust DA
        CROSS APPLY (Select value FROM GENERATE_SERIES(1, LEAST(DA.DailyAdjstVal, DA.MaxPotentialAdjust))) S(Adjust)
        CROSS APPLY (
            SELECT COUNT(*) AS AdjustCount
            FROM OrderedData OD
            WHERE OD.Date = DA.Date
            AND OD.PotentialAdjust >= S.Adjust
        ) C
        WHERE DA.DailyAdjstVal < DA.SumPotentialAdjust
        AND C.AdjustCount > 0
    ),
    Adjustments AS (
        -- Use a recursive CTE to effectively loop through the adjustment levels
        -- for each date until we have exhausted our available adjustment.
        -- Skip if DailyAdjstVal will cover all potential adjustments.
        -- (There might also be a way to do this using window functions at this point)
        SELECT
            DA.Date,
            0 AS Adjust,
            DA.DailyAdjstVal AS ResidualAdjust,
            DA.MaxPotentialAdjust
        FROM DailyAdjust DA
        WHERE DA.DailyAdjstVal < DA.SumPotentialAdjust
    
        UNION ALL
    
        SELECT
            A.Date,
            A.Adjust + 1 AS Adjust,
            A.ResidualAdjust - RPA.AdjustCount AS ResidualAdjust,
            A.MaxPotentialAdjust
        FROM Adjustments A
        JOIN RowsPerAdjustment RPA
            ON RPA.Date = A.Date
            AND RPA.Adjust = A.Adjust + 1
        WHERE 1=1
        AND A.ResidualAdjust > 0
        AND A.Adjust < A.MaxPotentialAdjust
        AND A.ResidualAdjust >= RPA.AdjustCount
    ),
    SelectedAdjustment AS (
        -- Now select the greatest level of adjustment.
        -- ResidualAdjust reflects the remaining adjustments allocated to less
        -- than all eligible rows
        SELECT A.date, MAX(A.Adjust) AS Adjust, MIN(A.ResidualAdjust) AS ResidualAdjust
        FROM Adjustments A
        GROUP BY A.Date
    )
    --SELECT * FROM OrderedData ORDER BY Date, Seq
    --SELECT * FROM DailyAdjust ORDER BY Date
    --SELECT * FROM RowsPerAdjustment ORDER BY Date, Adjust
    --SELECT * FROM Adjustments ORDER BY Date, Adjust
    --SELECT * FROM SelectedAdjustment ORDER BY Date
    SELECT
        -- Loose the hounds
        OD.Date, OD.ID, OD.StartingValue AS StartVal, OD.DailyAdjstVal AS DailyAdj,
        --OD.PotentialAdjust AS PA,
        --DA.SUMPotentialAdjust AS SumPA,
        --DA.MaxPotentialAdjust AS MaxPA,
        --SA.Adjust, SA.ResidualAdjust AS ResAdj,
        A.UnitAdjustment AS UnitAdj,
        OD.StartingValue - A.UnitAdjustment AS EndVal,
        SUM(A.UnitAdjustment) OVER(PARTITION BY OD.Date) AS SumAdj
    FROM OrderedData OD
    JOIN DailyAdjust DA ON DA.Date = OD.Date
    LEFT JOIN SelectedAdjustment SA ON SA.Date = OD.Date
    CROSS APPLY (
        -- Calculate full or partial adjustment
        SELECT CASE
            WHEN DA.SumPotentialAdjust <= DA.DailyAdjstVal 
            THEN OD.PotentialAdjust 
            ELSE LEAST(SA.Adjust, OD.PotentialAdjust)
                 + CASE WHEN OD.Seq < SA.ResidualAdjust THEN 1 ELSE 0 END
            END AS UnitAdjustment
    ) A
    ORDER BY OD.date, OD.Seq
    

    I believe the Adjustments, RowsPerAdjustment, and SelectedAdjustment steps can be combined into a nested query with some window functions that don't require a recursive CTE, but I am out of time at the moment. I may update this answer in a day or two.

    The above is for SQL Server 2022. If you are using an earlier version, you will need to replace references to GREATEST() and LEAST() functions with equivalent CASE statements and replace the GENERATE_SERIES() with an alternative number generator.

    See this db<>fiddle for a working example (with an extra case where the DailyAdjstVal exceeds eligible adjustments).

    Sample results.

    Date ID StartVal DailyAdj UnitAdj EndVal SumAdj
    2022-12-01 d 7 6 3 4 6
    2022-12-01 a 5 6 2 3 6
    2022-12-01 c 4 6 1 3 6
    2022-12-01 b 3 6 0 3 6
    2022-12-01 e 2 6 0 2 6
    2022-12-02 b 8 4 3 5 4
    2022-12-02 c 4 4 1 3 4
    2022-12-02 a 3 4 0 3 4
    2022-12-02 d 2 4 0 2 4
    2022-12-02 e 2 4 0 2 4
    2022-12-03 d 6 5 2 4 5
    2022-12-03 a 5 5 2 3 5
    2022-12-03 c 4 5 1 3 5
    2022-12-03 b 3 5 0 3 5
    2022-12-03 e 0 5 0 0 5
    2022-12-04 z 6 9 3 3 6
    2022-12-04 y 5 9 2 3 6
    2022-12-04 x 4 9 1 3 6
    2022-12-04 w 3 9 0 3 6
    2022-12-04 v 2 9 0 2 6

    Addendum: For SQL Server 2019, the following changes may be made to eliminate references to LEAST(), GREATEST(), and GENERATE_SERIES().

    1. the PotentialAdjust calculation can be replaced with:
            CASE WHEN StartingValue > @ThresholdValue
                THEN StartingValue - @ThresholdValue
                ELSE 0
                END AS PotentialAdjust,
    
    1. The following new CTE can be inserted immediately before the RowsPerAdjustment CTE:
    Series AS (
        -- Series of numbers from 1 to Max PotentialAdjust across all dates
        -- (Order doesn't matter 1..N is the same as N..1)
        SELECT Max(MaxPotentialAdjust) AS Adjust
        FROM DailyAdjust DA
    
        UNION ALL
    
        SELECT S.Adjust - 1 AS Adjust
        FROM Series S
        WHERE S.Adjust > 1
    ),
    
    1. The CROSS APPLY ... GENERATE_SERIES ... can now be replaced with:
        CROSS APPLY (
            SELECT S.Adjust
            FROM Series S
            WHERE S.Adjust <= DA.DailyAdjstVal    -- whichever is lower
            AND S.Adjust <= DA.MaxPotentialAdjust -- whichever is lower
        ) S
    
    1. At the end or the query, the UnitAdjustment calculation can be replaced with:
        SELECT CASE
            WHEN DA.SumPotentialAdjust <= DA.DailyAdjstVal 
            THEN OD.PotentialAdjust -- All rows (for this date) are fully adjusted
            WHEN OD.PotentialAdjust <= SA.Adjust
            THEN OD.PotentialAdjust -- This row is fully adjusted
            ELSE SA.Adjust + CASE WHEN OD.Seq < SA.ResidualAdjust THEN 1 ELSE 0 END
            END AS UnitAdjustment
    

    See this updated db<>fiddle targetting SQL Server 2019.

    I may still look at combining the RowsPerAdjustment, Adjustments, and SelectedAdjustment CTEs, but this may take another day or two.