Search code examples
sqlamazon-redshiftwindow-functionscumulative-sum

Conditional running total


I have an interesting thing that I have to do in Redshift. Say, I have a table like this:

index,total,delta
0,3,null
1,5,2
2,10,5
3,11,1
4,null,4
5,null,6
6,null,2

where delta is the difference between each total and the previous row's total. Here, my deltas come from a different source than my totals, so it is possible to receive deltas without receiving updated totals.

How can I impute the totals according to the deltas? Like so:

index,total,delta
0,3,null
1,5,2
2,10,5
3,11,1
4,15,4
5,21,6
6,23,2

I'm messing around somewhere in the vicinity of NVL(total, LAST_VALUE(total IGNORE NULLS) OVER (ORDER BY index ROWS UNBOUNDED PRECEDING) + SUM(delta) OVER (ORDER BY index ROWS UNBOUNDED PRECEDING), but that doesn't quite do it - I only want to SUM the deltas for which there are no corresponding total.


Solution

  • You could use:

    -- creating subgroups
    WITH cte AS (
      SELECT *, SUM(total IS NOT NULL::int) OVER(ORDER BY index) s
      FROM tab
    )
    SELECT index, 
        CASE WHEN total IS NULL 
             THEN SUM(COALESCE(total,0) + delta) OVER(PARTITION BY s ORDER BY index)
                 -FIRST_VALUE(delta) OVER(PARTITION BY s ORDER BY index)
                 -- running total starting from first not null total + delta
                 -- decreased by first delta
             ELSE total
        END AS total
       ,delta
    FROM cte
    ORDER BY index;
    

    db<>fiddle demo

    Output:

    ┌────────┬────────┬───────┐
    │ index  │ total  │ delta │
    ├────────┼────────┼───────┤
    │     0  │     3  │       │
    │     1  │     5  │     2 │
    │     2  │    10  │     5 │
    │     3  │    11  │     1 │
    │     4  │    15  │     4 │
    │     5  │    21  │     6 │
    │     6  │    23  │     2 │
    └────────┴────────┴───────┘
    

    EDIT:

    Actually there is no need for FIRST_VALUE:

    WITH cte AS (
      SELECT *, SUM(total IS NOT NULL::int) OVER(ORDER BY index) s
      FROM tab
    )
    SELECT index, 
       CASE WHEN total IS NULL 
        THEN SUM(COALESCE(total,0) 
           + CASE WHEN total IS NOT NULL THEN 0 ELSE delta END) 
            OVER(PARTITION BY s ORDER BY index)
        ELSE total
       END AS total
      ,delta
    FROM cte
    ORDER BY index;
    

    db<>fiddle demo2