This is my sample dataset. Seq_no
and value
columns are given, Cum.sum
and Bucket
columns we need to find using postgresql. To Find cumulative sum , i can write sum(value) over (order by seq_no rows between unbounded preceding and current row)
. But ask is reset cum.sum
if value reach threshold value. For example Threshold value is 20 and sum of the row 1,2 and 3 values are 23 which is greater than 20 . so we reset the window function. The next threshold value calculate from seq_no
4 onwards .
The below query recursively calculates both columns running_sum
and bucket
but all groups do not exceed 20 in sum. As 20 is a threshold, it makes sense. Or you should play with a threshold to get desirable output.
with recursive cte as (
select seq_no, value as running_sum, 1 as bucket
from data
where seq_no = 1
union all
select data.seq_no,
(case when running_sum + value > 20 then value
else running_sum + value
end),
(case when running_sum + value > 20 then bucket + 1
else bucket
end)
from cte join
data
on data.seq_no = cte.seq_no + 1
)
select *
from cte;
Anyway, you can use this query as starting point.