customer txn_date tag running_sum
A 1-Jan-17 1 1
A 2-Jan-17 1 2
A 3-Jan-17 1 3
A 4-Jan-17 1 4
A 5-Jan-17 1 5
A 6-Jan-17 1 6
A 7-Jan-17 0 0
A 8-Jan-17 1 1
A 9-Jan-17 1 2
A 10-Jan-17 1 3
A 11-Jan-17 0 0
A 12-Jan-17 0 0
A 13-Jan-17 1 1
A 14-Jan-17 1 2
A 15-Jan-17 0 0
How to get the running_sum and and reset the running_sum to zero if tag=0? Just like on the sample above. TIA
What you need to do is to create "groups" for each section of your 1s and 0s. You can do this by creating a boolean flag and then cumulatively summing over that column to get groups. From there you can cumulatively sum over your original tag
column by each group you created in the sub-query.
Query:
SELECT customer
, txn_date
, tag
, SUM(tag) OVER (PARTITION BY customer, flg_sum ORDER BY txn_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
FROM (
SELECT *
, SUM(tag_flg) OVER (PARTITION BY customer ORDER BY txn_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS flg_sum
FROM (
SELECT *
, CASE WHEN tag = 1 THEN 0 ELSE 1 END AS tag_flg
FROM database.table ) x ) y
Output:
customer txn_date tag running_sum
A 2017-01-01 1 1
A 2017-01-02 1 2
A 2017-01-03 1 3
A 2017-01-04 1 4
A 2017-01-05 1 5
A 2017-01-06 1 6
A 2017-01-07 0 0
A 2017-01-08 1 1
A 2017-01-09 1 2
A 2017-01-10 1 3
A 2017-01-11 0 0
A 2017-01-12 0 0
A 2017-01-13 1 1
A 2017-01-14 1 2
A 2017-01-15 0 0