I am working in SQL Server. I have a scenario where I have dollar amounts from various cost transactions (which I'll call Cashflow
) and a set of funding items. Each cashflow transaction and funding item is tied to a project via a column called ProjectId
. There can be thousands of cashflow transactions, and I need to display which funding item(s) those cashflow transactions will draw from.
Some items to consider:
DepletionOrder
column specifies in what order the funding items should be used for a particular projectCashflowRowNum
) should continue drawing against the same funding itemConsider the following test DLL and data:
DROP TABLE IF EXISTS #Funding
DROP TABLE IF EXISTS #Cashflow
CREATE TABLE #Funding
(
ProjectId BIGINT,
DepletionOrder BIGINT,
FundingAmount DECIMAL(32,2)
)
CREATE TABLE #Cashflow
(
ProjectId BIGINT,
CashflowRowNum BIGINT,
CashflowAmount DECIMAL(32,2),
)
INSERT INTO #Funding
VALUES
(1, 1, 1000000),
(1, 2,10000),
(1, 3, 400),
(1, 4, 40000),
(2, 1, 50000),
(2, 2, 25000)
INSERT INTO #Cashflow
VALUES
(1, 1, 999000),
(1, 2, 8000),
(1, 3, 4000),
(2, 1, 60000),
(2, 2, 20000),
(2, 3, 30000)
Here are the results I'm expecting. To be honest, I'm really just trying to figure out how to get ProjectId
, CashflowRowNum
, DepletionOrder
and CashflowFunded
all in the same query result.
Everything else I just included it for visibility:
ProjectId | CashflowRowNum | DepletionOrder | CashflowAmount | FundingAmount | CashflowFunded | CashflowRemainingToFund | FundingRemaining |
---|---|---|---|---|---|---|---|
1 | 1 | 1 | 999000 | 1000000 | 999000 | 0 | 1000 |
1 | 2 | 1 | 8000 | 1000000 | 1000 | 7000 | 0 |
1 | 2 | 2 | 8000 | 10000 | 7000 | 0 | 3000 |
1 | 3 | 2 | 4000 | 10000 | 3000 | 1000 | 0 |
1 | 3 | 3 | 4000 | 400 | 400 | 600 | 0 |
1 | 3 | 4 | 4000 | 40000 | 600 | 0 | 39400 |
2 | 1 | 1 | 60000 | 50000 | 50000 | 10000 | 0 |
2 | 1 | 2 | 60000 | 20000 | 10000 | 0 | 10000 |
2 | 2 | 2 | 20000 | 60000 | 10000 | 10000 | 0 |
2 | 2 | 3 | 20000 | 5000 | 5000 | 5000 | 0 |
I saw some very similar posts like this one, and I am sure there is a solution that involves window functions and/or recursive CTEs, but I can't seem to wrap my head around how to use them for this exact scenario. I am also a bit concerned with the performance of a recursive CTE if CashflowRowNum
gets in the 1000s.
Note I also do not want to use cursors or while loops. That would certainly simplify things, but I do not think it would perform well given the number of rows in the Cashflow
table.
EDIT:
The comments bring up some fair points. I suppose I should have tried the cursor-based approach to see if it is actually "too slow" before writing it off.
I originally didn't post my attempts with the recursive CTE since I wasn't even getting remotely close. However, using HABO's recursive CTE approach in the post linked above, I think I am about 95% there. I have CashflowRemainingToFund
and FundingRemaining
calculating correctly, but I cannot figure out the proper formula for CashflowFunded
which is what I ultimately need. The bottom portion of the UNION is what I can't figure out. It seems straightforward enough to conceptualize, but writing it out in in the recursive CTE has me stumped. Any ideas?
;with Cashflow as (
SELECT *,
SUM(CashflowAmount) OVER (PARTITION BY ProjectId ORDER BY CashflowRowNum) AS CashflowRunningTotalByProject
FROM #Cashflow
),
Funding as (
SELECT *,
SUM(FundingAmount) OVER (PARTITION BY ProjectId ORDER BY DepletionOrder) AS FundingRunningTotalByProject
FROM #Funding
),
CashflowFunding as (
select f.ProjectId,
f.DepletionOrder,
CashflowRowNum,
CashflowRunningTotalByProject,
f.FundingAmount,
f.FundingRunningTotalByProject,
cf.CashflowAmount,
convert(
decimal(32,2),
case
when cf.CashflowAmount IS NULL
then f.FundingAmount
when f.FundingAmount >= CashflowRunningTotalByProject
then f.FundingAmount - CashflowRunningTotalByProject
when f.FundingAmount < CashflowRunningTotalByProject
then 0
end
) as FundingRemaining,
convert(
decimal(32,2),
case
when cf.CashflowAmount IS NULL
then 0
when f.FundingAmount >= CashflowRunningTotalByProject
then 0
when f.FundingAmount < CashflowRunningTotalByProject
then CashflowRunningTotalByProject - f.FundingAmount
end
) as CashflowRemainingToFund,
convert(
decimal(32,2),
case
when cf.CashflowAmount IS NULL
then null
when f.FundingAmount >= cf.CashflowRunningTotalByProject
then CashflowAmount
when f.FundingAmount < cf.CashflowRunningTotalByProject
then CashflowAmount + (f.FundingAmount - CashflowRunningTotalByProject)
end
) as CashflowFunded
from Funding as f
left join Cashflow as cf on cf.ProjectId = f.ProjectId
where f.DepletionOrder = 1
union all
select f.ProjectId,
f.DepletionOrder,
CTE.CashflowRowNum,
CashflowRunningTotalByProject,
f.FundingAmount,
f.FundingRunningTotalByProject,
CTE.CashflowAmount,
convert(
decimal(32,2),
case
when f.FundingAmount >= CTE.CashflowRemainingToFund then f.FundingAmount - CTE.CashflowRemainingToFund
when f.FundingAmount < CTE.CashflowRemainingToFund then 0
end
) as FundingRemaining,
convert(
decimal(32,2),
case
when CTE.FundingRemaining + f.FundingAmount >= CTE.CashflowRemainingToFund then 0
when CTE.FundingRemaining + f.FundingAmount < CTE.CashflowRemainingToFund then CTE.CashflowRemainingToFund - CTE.FundingRemaining - f.FundingAmount
end
) as CashflowRemainingToFund,
/*this part is incorrect*/
convert(
decimal(32,2),
case
when f.FundingAmount >= CTE.CashflowRemainingToFund then CTE.CashflowRemainingToFund
when f.FundingAmount < CTE.CashflowRemainingToFund then CTE.FundingRemaining - f.FundingAmount
end
) as CashflowFunded
from CashflowFunding as CTE
inner join Funding as f on f.ProjectId = CTE.ProjectId
and f.DepletionOrder = CTE.DepletionOrder + 1
where CTE.CashflowRemainingToFund > 0
)
select
*,
SUM(CashflowRemainingToFund) OVER (PARTITION BY ProjectId, CashflowRowNum ORDER BY DepletionOrder) AS SumCfRemaining,
case when DepletionOrder = ( select max( DepletionOrder ) from #Funding where ProjectId = CashflowFunding.ProjectId )
then FundingRemaining - CashflowRemainingToFund
else NULL
end as SurplusOrDeficit
from CashflowFunding
where CashflowRemainingToFund <= CashflowAmount
order by ProjectId, DepletionOrder, CashflowRowNum
For now, I'll likely write a cursor to do it anyway, but I find this approach pretty interesting so I'd like to finish it off if possible. Thanks in advance for any help.
This appears to be a FIFO (First-In, First-Out) allocation problem.
This can be solved without a loop by calculating the accumulated start/stop ranges for each source (funding), calculating the same for each sink (cash flow), and then identifying the overlaps between the source and sink ranges.
The ranges can be calculated using a running-sum SUM(...) OVER(...)
window functions. Overlaps can be identified using the condition start1 < end2 AND start2 < end1
. The actual overlap range (and size) can be calculated as GREATEST(start1, start2)
through LEAST(end1, end2)
. All calculations are done separately for each partition (ProjectId
)
WITH CummulativeFunding AS (
SELECT
*,
SUM(FundingAmount) OVER(PARTITION BY ProjectId ORDER BY DepletionOrder)
- FundingAmount
AS FromAmount,
SUM(FundingAmount) OVER(PARTITION BY ProjectId ORDER BY DepletionOrder)
AS ToAmount
FROM #Funding
),
CummulativeCashFlow AS (
SELECT
*,
SUM(CashflowAmount) OVER(PARTITION BY ProjectId ORDER BY CashflowRowNum)
- CashflowAmount
AS FromAmount,
SUM(CashflowAmount) OVER(PARTITION BY ProjectId ORDER BY CashflowRowNum)
AS ToAmount
FROM #Cashflow
)
SELECT
CCF.ProjectId,
CCF.CashflowRowNum,
CF.DepletionOrder,
CCF.CashflowAmount,
CF.FundingAmount,
OVL.ToAmount - OVL.FromAmount AS CashflowFunded,
CCF.ToAmount - OVL.ToAmount AS CashflowRemainingToFund,
CF.ToAmount - OVL.ToAmount AS FundingRemaining
FROM CummulativeFunding CF
JOIN CummulativeCashFlow CCF
ON CCF.ProjectId = CF.ProjectId
AND CCF.FromAmount < CF.ToAmount
AND CF.FromAmount < CCF.ToAmount
CROSS APPLY (
SELECT
-- For SQL Server versions 2022 and later
GREATEST(CF.FromAmount, CCF.FromAmount) AS FromAmount,
LEAST(CF.ToAmount, CCF.ToAmount) AS ToAmount
-- For earlier SQL Server versions
--IIF(CF.FromAmount > CCF.FromAmount ,CF.FromAmount, CCF.FromAmount) AS FromAmount,
--IIF(CF.ToAmount < CCF.ToAmount ,CF.ToAmount, CCF.ToAmount) AS ToAmount
) OVL
ORDER BY CF.ProjectId, OVL.FromAmount
The LEAST()
and GREATEST()
functions are available in SQL Server versions 2022 and later. For earlier versions, an equivalent CASE
expression or IIF()
function can be used.
You might need separate logic to identify remaining funding or unfunded cash flows, if the source and sink totals differ.
Results:
ProjectId | Cashflow RowNum |
Depletion Order |
Cashflow Amount |
Funding Amount |
Cashflow Funded |
Cashflow RemainingToFund |
Funding Remaining |
---|---|---|---|---|---|---|---|
1 | 1 | 1 | 999000.00 | 1000000.00 | 999000.00 | 0.00 | 1000.00 |
1 | 2 | 1 | 8000.00 | 1000000.00 | 1000.00 | 7000.00 | 0.00 |
1 | 2 | 2 | 8000.00 | 10000.00 | 7000.00 | 0.00 | 3000.00 |
1 | 3 | 2 | 4000.00 | 10000.00 | 3000.00 | 1000.00 | 0.00 |
1 | 3 | 3 | 4000.00 | 400.00 | 400.00 | 600.00 | 0.00 |
1 | 3 | 4 | 4000.00 | 40000.00 | 600.00 | 0.00 | 39400.00 |
2 | 1 | 1 | 60000.00 | 50000.00 | 50000.00 | 10000.00 | 0.00 |
2 | 1 | 2 | 60000.00 | 25000.00 | 10000.00 | 0.00 | 15000.00 |
2 | 2 | 2 | 20000.00 | 25000.00 | 15000.00 | 5000.00 | 0.00 |
Some of the CashflowFunded
amounts calculated above do not match the posted desired results, but they do appear to be consistent with the sample source data.
See this db<>fiddle for a demo.