Search code examples
sqlpostgresqlgaps-and-islands

Sum over the rows using SQL but we need to stop and start the sum at specific condition


Here is an example of the data I have and the output I want in SQL.

id date flag
a 2022-04-05 0
a 2022-04-06 1
a 2022-04-07 1
a 2022-04-08 1
a 2022-04-09 0
a 2022-04-10 0
a 2022-04-11 1
a 2022-04-12 1
a 2022-04-13 1
a 2022-04-14 1
a 2022-04-15 0
a 2022-04-16 0
b 2022-04-05 0
b 2022-04-06 1
b 2022-04-07 1
b 2022-04-08 0

Desired Output

id date flag count
a 2022-04-05 0 0
a 2022-04-06 1 1
a 2022-04-07 1 2
a 2022-04-08 1 3
a 2022-04-09 0 0
a 2022-04-10 0 0
a 2022-04-11 1 1
a 2022-04-12 1 2
a 2022-04-13 1 3
a 2022-04-14 1 4
a 2022-04-15 0 0
a 2022-04-16 0 0
b 2022-04-05 0 0
b 2022-04-06 1 1
b 2022-04-07 1 2
b 2022-04-08 0 0

Basically the increment should start if the value of flag is 1 and continue incrementing until a flag of 0 is reached, then continue incrementing from the next flag of 1 until the next 0, and so on.


Solution

  • This is a gaps and islands problem. One approach uses the difference in row numbers method:

    WITH cte AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) rn1,
                  ROW_NUMBER() OVER (PARTITION BY id, flag ORDER BY date) rn2
        FROM yourTable
    )
    
    SELECT id, date, flag,
           SUM(flag) OVER (PARTITION BY id, flag, rn1 - rn2 ORDER BY date) AS count
    FROM cte
    ORDER BY id, date;