I have a column a_increment
and I want to use it to calculate a column a_cumulative
, which is the sum of all earlier a_increment
, including the current row. "Earlier" as defined by the timestamp
column "DateTime"
.
I have tried several approaches, for example an iterative approach where I calculate the new a_cumulative
as the previous a_cumulative
+ the current a_increment
. However, this was done backwards in time, which does not yield the correct result.
I found that a window function could yield the correct result, however, window functions are not allowed in an update of the table.
My intuition is that using an iterative approach would be fastest, but I don't know how to force the iteration to go forward.
Here is what I expect:
ID "DateTime" a_increment a_cumulative
1 "2024-05-17 09:00:04+02" 1 1
2 "2024-05-17 09:00:14+02" 2 3
3 "2024-05-17 09:00:24+02" 0 3
4 "2024-05-17 09:00:34+02" 5 8
This is the code for brute-forcing (it works, but it is very inefficient, took more than 5 days to run):
UPDATE data_cumulative_sums t
SET a_cumulative = COALESCE((SELECT SUM(a_increment) FROM "Data" WHERE "DateTime" <= t."DateTime"), 0);
This is where I tried to do it iteratively:
UPDATE data_cumulative_sums t
SET a_cumulative = COALESCE((SELECT COALESCE(t2.a_cumulative, 0) FROM data_cumulative_sums t2 WHERE t2.id = t.id - 1), 0) + t.a_increment
This returns:
ID "DateTime" a_increment a_cumulative
1 "2024-05-17 09:00:04+02" 1 1
2 "2024-05-17 09:00:14+02" 2 2
3 "2024-05-17 09:00:24+02" 0 0
4 "2024-05-17 09:00:34+02" 5 5
If I run it again, it returns:
ID "DateTime" a_increment a_cumulative
1 "2024-05-17 09:00:04+02" 1 1
2 "2024-05-17 09:00:14+02" 2 3
3 "2024-05-17 09:00:24+02" 0 2
4 "2024-05-17 09:00:34+02" 5 5
If I run it as many times as I have rows, the expected result is achieved (but that would be too slow, of course)
I have found the window function:
Select sum(a_increment) over (order by "DateTime")
from data_cumulative_sums
which returns the expected result. However, when I try:
UPDATE data_cumulative_sums t
SET a_cumulative =sum(a_increment) OVER (ORDER BY "DateTime");
I get the error:
window functions are not allowed in UPDATE
... however, window functions are not allowed in an update of the table.
That's almost, but not quite true. (That error message might be more specific.) You can still use window functions in a subquery or CTE, and then self-join. Like:
UPDATE "Data" t
SET a_cumulative = COALESCE(u.a_cumu, 0)
FROM (
SELECT id, SUM(a_increment) OVER (ORDER BY "DateTime") AS a_cumu
FROM "Data"
) u
WHERE u.id = t.id;
id
being the primary key of the table. You can use any unique, not-null (combination of) column(s).
If there can be duplicate values in "DateTime"
you have to define how to deal with those exactly. This query includes all peers in a_cumulative
.