Search code examples
sqlsql-servert-sql

Subtracting values from separate source table sequentially/ running values


I am working in SQL Server. I have a scenario where I have dollar amounts from various cost transactions (which I'll call Cashflow) and a set of funding items. Each cashflow transaction and funding item is tied to a project via a column called ProjectId. There can be thousands of cashflow transactions, and I need to display which funding item(s) those cashflow transactions will draw from.

Some items to consider:

  • Each cashflow row can be funded by multiple funding items if needed
  • The DepletionOrder column specifies in what order the funding items should be used for a particular project
  • If there is still money left in a funding item after a cashflow transaction has drawn down against it, the next cashflow transaction (in order of the CashflowRowNum) should continue drawing against the same funding item

Consider the following test DLL and data:

DROP TABLE IF EXISTS #Funding
DROP TABLE IF EXISTS #Cashflow

CREATE TABLE #Funding 
(
    ProjectId BIGINT,
    DepletionOrder BIGINT,
    FundingAmount DECIMAL(32,2)
)

CREATE TABLE #Cashflow 
(
    ProjectId BIGINT,
    CashflowRowNum BIGINT,
    CashflowAmount DECIMAL(32,2),
)

INSERT INTO #Funding
VALUES 
(1, 1, 1000000),
(1, 2,10000), 
(1, 3, 400),
(1, 4, 40000),
(2, 1, 50000), 
(2, 2, 25000)

INSERT INTO #Cashflow
VALUES 
(1, 1, 999000),
(1, 2, 8000),
(1, 3, 4000),
(2, 1, 60000),
(2, 2, 20000),
(2, 3, 30000)

Here are the results I'm expecting. To be honest, I'm really just trying to figure out how to get ProjectId, CashflowRowNum, DepletionOrder and CashflowFunded all in the same query result.

Everything else I just included it for visibility:

ProjectId CashflowRowNum DepletionOrder CashflowAmount FundingAmount CashflowFunded CashflowRemainingToFund FundingRemaining
1 1 1 999000 1000000 999000 0 1000
1 2 1 8000 1000000 1000 7000 0
1 2 2 8000 10000 7000 0 3000
1 3 2 4000 10000 3000 1000 0
1 3 3 4000 400 400 600 0
1 3 4 4000 40000 600 0 39400
2 1 1 60000 50000 50000 10000 0
2 1 2 60000 20000 10000 0 10000
2 2 2 20000 60000 10000 10000 0
2 2 3 20000 5000 5000 5000 0

I saw some very similar posts like this one, and I am sure there is a solution that involves window functions and/or recursive CTEs, but I can't seem to wrap my head around how to use them for this exact scenario. I am also a bit concerned with the performance of a recursive CTE if CashflowRowNum gets in the 1000s.

Note I also do not want to use cursors or while loops. That would certainly simplify things, but I do not think it would perform well given the number of rows in the Cashflow table.

EDIT:
The comments bring up some fair points. I suppose I should have tried the cursor-based approach to see if it is actually "too slow" before writing it off.

I originally didn't post my attempts with the recursive CTE since I wasn't even getting remotely close. However, using HABO's recursive CTE approach in the post linked above, I think I am about 95% there. I have CashflowRemainingToFund and FundingRemaining calculating correctly, but I cannot figure out the proper formula for CashflowFunded which is what I ultimately need. The bottom portion of the UNION is what I can't figure out. It seems straightforward enough to conceptualize, but writing it out in in the recursive CTE has me stumped. Any ideas?

;with Cashflow as (
    SELECT *,
        SUM(CashflowAmount) OVER (PARTITION BY ProjectId ORDER BY CashflowRowNum) AS CashflowRunningTotalByProject
    FROM #Cashflow
),
Funding as (
    SELECT *,
        SUM(FundingAmount) OVER (PARTITION BY ProjectId ORDER BY DepletionOrder) AS FundingRunningTotalByProject
    FROM #Funding
),
CashflowFunding as (
    select f.ProjectId, 
        f.DepletionOrder, 
        CashflowRowNum,
        CashflowRunningTotalByProject,
        f.FundingAmount, 
        f.FundingRunningTotalByProject,
        cf.CashflowAmount,
        convert(
            decimal(32,2),
            case
                when cf.CashflowAmount IS NULL 
                    then f.FundingAmount
                when f.FundingAmount >= CashflowRunningTotalByProject
                    then f.FundingAmount - CashflowRunningTotalByProject
                when f.FundingAmount < CashflowRunningTotalByProject
                    then 0
            end
        ) as FundingRemaining,
        convert(
            decimal(32,2),
            case
                when cf.CashflowAmount IS NULL 
                    then 0
                when f.FundingAmount >= CashflowRunningTotalByProject
                    then 0
                when f.FundingAmount < CashflowRunningTotalByProject
                    then CashflowRunningTotalByProject - f.FundingAmount
            end
        ) as CashflowRemainingToFund,
        convert(
            decimal(32,2),
            case
                when cf.CashflowAmount IS NULL 
                    then null
                when f.FundingAmount >= cf.CashflowRunningTotalByProject
                    then CashflowAmount
                when f.FundingAmount < cf.CashflowRunningTotalByProject
                    then CashflowAmount + (f.FundingAmount - CashflowRunningTotalByProject)
            end
        ) as CashflowFunded
    from Funding as f 
        left join Cashflow as cf on cf.ProjectId = f.ProjectId
    where f.DepletionOrder = 1
  
    union all

    select f.ProjectId, 
        f.DepletionOrder, 
        CTE.CashflowRowNum,
        CashflowRunningTotalByProject,
        f.FundingAmount, 
        f.FundingRunningTotalByProject,
        CTE.CashflowAmount,
        convert(
            decimal(32,2),
            case
                when f.FundingAmount >= CTE.CashflowRemainingToFund then f.FundingAmount - CTE.CashflowRemainingToFund
                when f.FundingAmount < CTE.CashflowRemainingToFund then 0
            end
        ) as FundingRemaining,
        convert(
            decimal(32,2),
            case
                when CTE.FundingRemaining + f.FundingAmount >= CTE.CashflowRemainingToFund then 0
                when CTE.FundingRemaining + f.FundingAmount < CTE.CashflowRemainingToFund then CTE.CashflowRemainingToFund - CTE.FundingRemaining - f.FundingAmount
            end
        ) as CashflowRemainingToFund,

        /*this part is incorrect*/
        convert(
            decimal(32,2),
            case
                when f.FundingAmount >= CTE.CashflowRemainingToFund then CTE.CashflowRemainingToFund
                when f.FundingAmount < CTE.CashflowRemainingToFund then CTE.FundingRemaining - f.FundingAmount
            end
        ) as CashflowFunded
    from CashflowFunding as CTE 
        inner join Funding as f on f.ProjectId = CTE.ProjectId 
            and f.DepletionOrder = CTE.DepletionOrder + 1
    where CTE.CashflowRemainingToFund > 0
)
select 
    *,
    SUM(CashflowRemainingToFund) OVER (PARTITION BY ProjectId, CashflowRowNum ORDER BY DepletionOrder) AS SumCfRemaining,
    case when DepletionOrder = ( select max( DepletionOrder ) from #Funding where ProjectId = CashflowFunding.ProjectId ) 
        then FundingRemaining - CashflowRemainingToFund
        else NULL 
    end as SurplusOrDeficit
from CashflowFunding
where CashflowRemainingToFund <= CashflowAmount
order by ProjectId, DepletionOrder, CashflowRowNum

For now, I'll likely write a cursor to do it anyway, but I find this approach pretty interesting so I'd like to finish it off if possible. Thanks in advance for any help.


Solution

  • This appears to be a FIFO (First-In, First-Out) allocation problem.

    This can be solved without a loop by calculating the accumulated start/stop ranges for each source (funding), calculating the same for each sink (cash flow), and then identifying the overlaps between the source and sink ranges.

    The ranges can be calculated using a running-sum SUM(...) OVER(...) window functions. Overlaps can be identified using the condition start1 < end2 AND start2 < end1. The actual overlap range (and size) can be calculated as GREATEST(start1, start2) through LEAST(end1, end2). All calculations are done separately for each partition (ProjectId)

    WITH CummulativeFunding AS (
        SELECT
            *,
            SUM(FundingAmount) OVER(PARTITION BY ProjectId ORDER BY DepletionOrder)
                - FundingAmount
                AS FromAmount,
            SUM(FundingAmount) OVER(PARTITION BY ProjectId ORDER BY DepletionOrder)
                AS ToAmount
        FROM #Funding
    ),
    CummulativeCashFlow AS (
        SELECT
            *,
            SUM(CashflowAmount) OVER(PARTITION BY ProjectId ORDER BY CashflowRowNum)
                - CashflowAmount
                AS FromAmount,
            SUM(CashflowAmount) OVER(PARTITION BY ProjectId ORDER BY CashflowRowNum)
                AS ToAmount
        FROM #Cashflow
    )
    SELECT
        CCF.ProjectId,
        CCF.CashflowRowNum,
        CF.DepletionOrder,
        CCF.CashflowAmount,
        CF.FundingAmount,
        OVL.ToAmount - OVL.FromAmount AS CashflowFunded,
        CCF.ToAmount - OVL.ToAmount AS CashflowRemainingToFund,
        CF.ToAmount - OVL.ToAmount AS FundingRemaining
    FROM CummulativeFunding CF
    JOIN CummulativeCashFlow CCF
        ON CCF.ProjectId = CF.ProjectId
        AND CCF.FromAmount < CF.ToAmount
        AND CF.FromAmount < CCF.ToAmount
    CROSS APPLY (
        SELECT
            -- For SQL Server versions 2022 and later
            GREATEST(CF.FromAmount, CCF.FromAmount) AS FromAmount,
            LEAST(CF.ToAmount, CCF.ToAmount) AS ToAmount
            -- For earlier SQL Server versions
            --IIF(CF.FromAmount > CCF.FromAmount ,CF.FromAmount, CCF.FromAmount) AS FromAmount,
            --IIF(CF.ToAmount < CCF.ToAmount ,CF.ToAmount, CCF.ToAmount) AS ToAmount
    ) OVL
    ORDER BY CF.ProjectId, OVL.FromAmount
    

    The LEAST() and GREATEST() functions are available in SQL Server versions 2022 and later. For earlier versions, an equivalent CASE expression or IIF() function can be used.

    You might need separate logic to identify remaining funding or unfunded cash flows, if the source and sink totals differ.

    Results:

    ProjectId Cashflow
    RowNum
    Depletion
    Order
    Cashflow
    Amount
    Funding
    Amount
    Cashflow
    Funded
    Cashflow
    RemainingToFund
    Funding
    Remaining
    1 1 1 999000.00 1000000.00 999000.00 0.00 1000.00
    1 2 1 8000.00 1000000.00 1000.00 7000.00 0.00
    1 2 2 8000.00 10000.00 7000.00 0.00 3000.00
    1 3 2 4000.00 10000.00 3000.00 1000.00 0.00
    1 3 3 4000.00 400.00 400.00 600.00 0.00
    1 3 4 4000.00 40000.00 600.00 0.00 39400.00
    2 1 1 60000.00 50000.00 50000.00 10000.00 0.00
    2 1 2 60000.00 25000.00 10000.00 0.00 15000.00
    2 2 2 20000.00 25000.00 15000.00 5000.00 0.00

    Some of the CashflowFunded amounts calculated above do not match the posted desired results, but they do appear to be consistent with the sample source data.

    See this db<>fiddle for a demo.