I have the following table, for which I am trying to calculate a running balance, and remaining value, but the remaining value is the function of the previously calculated row, as such:
date PR amount total balance remaining_value
----------------------------------------------------------
'2020-1-1' 1 1.0 100.0 1.0 100 -- 100 (inital total)
'2020-1-2' 1 2.0 220.0 3.0 320 -- 100 (previous row) + 220
'2020-1-3' 1 -1.5 -172.5 1.5 160 -- 320 - 160 (see explanation 1)
'2020-1-4' 1 3.0 270.0 4.5 430 -- 160 + 270
'2020-1-5' 1 1.0 85.0 5.5 515 -- 430 + 85
'2020-1-6' 1 2.0 202.0 7.5 717 -- 575 + 202
'2020-1-7' 1 -4.0 -463.0 3.5 334.6 -- 717 - 382.4 (see explanation 2)
'2020-1-8' 1 -0.5 -55.0 3.0 ...
'2020-1-9' 1 2.0 214.0 5.0
'2020-1-1' 2 1.0 100 1.0 100 -- different PR: start new running total
The logic is as follows:
For positive amount rows, the remaining value is simply the value from the previous row in column remaining_value
+ the value in column total
from that row.
For negative amount rows, it gets tickier:
Explanation 1: We start with 320
(previous row balance) and from it we remove 1.5/3.0
(absolute value of current row amount divided by previous row balance) and we multiply it by the previous row remaining_value
, which is 320
. The calculation gives:
320 - (1.5/3 * 320) = 160
Explanation 2: Same logic as above. 717 - (4/7.5 * 717) = 717 - 382.4
4/7.5
here represents the current row's absolute amount divided by the previous row's balance.
I tried the window function sum()
but did not manage to get the desired result. Is there a way to get this done in PostgreSQL without having to resort to a loop?
Extra complexity: There are multiple products identified by PR (product id), 1, 2 etc. Each need their own running total and calculation.
You could create a custom aggregate function:
CREATE OR REPLACE FUNCTION f_special_running_sum (_state numeric, _total numeric, _amount numeric, _prev_balance numeric)
RETURNS numeric
LANGUAGE sql IMMUTABLE AS
'SELECT CASE WHEN _amount > 0 THEN _state + _total
ELSE _state * (1 + _amount / _prev_balance) END';
CREATE OR REPLACE AGGREGATE special_running_sum (_total numeric, _amount numeric, _prev_balance numeric) (
sfunc = f_special_running_sum
, stype = numeric
, initcond = '0'
);
The CASE
expression does the split: If amount is positive, just add total, else apply your (simplified) formula:
320 * (1 + -1.5 / 3.0)
instead of 320 - (1.5/3 * 320)
, i.e.:
_state * (1 + _amount / _prev_balance)
Function and aggregate parameter names are only for documentation.
Then your query can look like this:
SELECT *
, special_running_sum(total, amount, prev_balance) OVER (PARTITION BY pr ORDER BY date)
FROM (
SELECT pr, date, amount, total
, lag(balance, 1, '1') OVER (PARTITION BY pr ORDER BY date) AS prev_balance
FROM tbl
) t;
db<>fiddle here
We need a subquery to apply the first window function lag()
and fetch the previous balance into the current row (prev_balance
). I default to 1
if there is no previous row to avoid NULL
values.
Caveats:
If the first row has a negative total, the result is undefined. My aggregate function defaults to 0
.
You did not declare data types, nor requirements regarding precision. I assume numeric
and aim for maximum precision. The calculation with numeric
is precise. But your formula produces fractional decimal numbers. Without rounding, there will be a lot of fractional digits after a couple of divisions, and the calculation will quickly degrade in performance. You'll have to strike a compromise between precision and performance. For example, doing the same with double precision
has constant performance.
Related: