Search code examples
sqlsnowflake-cloud-data-platformcumulative-sumansi-sql

Running Total to the latest period


I want to query the table so that the running total is repeatedly carrying over to the latest period as long as the value dose not fall to 0. Assuming I have a table with values as such below:

Name Period Value
A 02/2022 2
A 03/2022 5
A 04/2022 3
A 05/2022 7
B 02/2022 9
B 04/2022 6

I want my result to be:

| Name | Period | Value|
| A| 02/2022| 2 |
| A| 03/2022| 7 |
| A| 04/2022| 10|
| A| 05/2022| 17|
| B| 02/2022| 9 |
| B| 03/2022| 9 |
| B| 04/2022| 15|
| B| 05/2022| 15|

My current query is:

SELECT 
    PERIOD
    ,NAME
    ,SUM(SUM(Value)) OVER (PARTITION BY NAME ORDER BY PERIOD) AS balance
  FROM 
    table

 WHERE Period < CURRENT_DATE()
 GROUP BY 
   1
   ,2

This results in the value stopping at the latest period the activity occurred as such:

| Name | Period | Value|
| A    | 02/2022| 2    |
| A    | 03/2022| 7    |
| A    | 04/2022| 10   |
| A    | 05/2022| 17   |
| B    | 02/2022| 9    |
| B    | 04/2022| 15   |

Solution

  • OK, you haven't had an answer in a full day so even though I work in TSQL I'll try a solution that's ANSI SQL compatible. Work with me if my syntax is off a bit.

    Before we start, check your "Desired Output", you're currently showing a running total for A on 3/22 of 5, but you had a 2 for A on 2/22 so it should be a running total of 7, right?

    Anyway, assuming that's just a typo, I'd approach this by making a few CTEs that build a list of all {PERIOD, NAME} pairs you want reported, then JOIN your actual data to that. There are a number of ways to generate the dates, the easiest is to use DISTINCT if your actual data is fairly robust, but I can describe other methods if that assumption does not hold for your data.

    So with all that in mind, here is my solution. I put your sample data in a CTE for portability, just replace my "cteTabA" with whatever your data table is really named

    --Code sample data as a CTE for portability
    ;with cteTabA as ( 
        SELECT * 
        FROM ( VALUES 
            ('A', '02/2022', '2')
            , ('A', '03/2022', '5')
            , ('A', '04/2022', '3')
            , ('A', '05/2022', '7')
            , ('B', '02/2022', '9')
            , ('B', '04/2022', '6')
        ) as TabA(Name, Period, Value) 
    ) --END of sample data, actual query below
    --First, build a list of periods to use. If your data set is full, just select DISTINCT
    , cteDates as ( --but there are other ways if this doesn't work for you - let me know!
        SELECT DISTINCT Period FROM cteTabA
    ) --Next, build a list of names to report on
    , cteNames as (
        SELECT DISTINCT Name FROM cteTabA
    ) --Now build your table that has all periods for all names
    , cteRepOn as (
        SELECT * FROM cteNames CROSS JOIN cteDates 
    )--Now assemble a table that has entries for each period for each name,
    --but fill in zeroes for those you don't actually have data for
    , cteFullList as (
        SELECT L.*, COALESCE(D.Value, 0) as Value
        FROM cteRepOn as L 
            LEFT OUTER JOIN cteTabA as D on L.Name = D.Name AND L.Period = D.Period 
    )--Now your query works as expected with the gaps filled in
    SELECT PERIOD, NAME, Value 
        ,SUM(Value) OVER (PARTITION BY NAME ORDER BY PERIOD) AS balance
    FROM  cteFullList
    WHERE Period < '06/2022'--CURRENT_DATE()
    ORDER BY NAME, PERIOD
    

    This produces an output as follows

    PERIOD NAME Value balance
    02/2022 A 2 2
    03/2022 A 5 7
    04/2022 A 3 10
    05/2022 A 7 17
    02/2022 B 9 9
    03/2022 B 0 9
    04/2022 B 6 15
    05/2022 B 0 15