Search code examples
sqlsql-servert-sqldbt

Cumulative sum of all preceding dates in SQL


I have an inventory movement table from our ERP, and to calculate stock on hand at a specific date (X), the calculation is the sum of all Quantity where Date is less than or equal to X.

(Date is DDMMYYYY in the below example)

raw.InventoryTable -

Date Quantity Item Value
01/01/2022 5 WidgetA 1
01/01/2023 -5 WidgetA -1
31/01/2023 5 WidgetA 1
01/01/2022 5 WidgetB 5
01/02/2023 -1 WidgetB -5
31/02/2023 1 WidgetB 5

So if I want to know inventory quantity as at end of month January 2023, it is the sum of quantity throughout all preceding time, e.g.

select 
      sum(i.quantity) as [On Hand]
    , sum(i.Value) as [Inventory Value]
    , i.Item
from raw.InventoryTable i
where Date <= '2022-01-31'
group by 
    i.item

What I want to build is an aggregate table for the end of each month (for instance just for this year), so we would have pre-agg'd stock on hand calculations and from a reporting perspective, we only have to select data from the relevant month and not worry about aggregation.

Desired output:

int.agg_Inventory_by_month

Date AggQuantity Item AggValue
31/01/2023 5 WidgetA 1
31/01/2023 5 WidgetB 5
31/02/2023 5 WidgetA 1
31/02/2023 1 WidgetB 5

It's easier to perform time intelligence calculations in reporting functions with this kind of pre-agg.

I could build the aggregation in Python by holding a start date, iterating through and summing in a loop, but would ideally like to contain transformation logic exclusively in SQL (either SQL or SQL+Jinja via dbt).


Solution

  • If I'm understanding the question correctly, it looks like you can join with dbt_utils.date_spine() to aggregate. Here's an example model using Snowflake syntax. You can swap out the SQL functions for the SQL Server equivalent and replace the start/end date with whatever you want to establish the range.

    {{
      config({
        "materialized": 'table',
      })
    }}
    
    WITH DATE_RANGE AS (
      SELECT DATE_MONTH
      FROM ({{ dbt_utils.date_spine(datepart="month", start_date="TO_DATE('2022-01-01', 'YYYY-MM-DD')", end_date="TO_DATE('2022-12-31', 'YYYY-MM-DD')") }})
    ),
    INVENTORY AS (
      SELECT * FROM {{ ref('inventory') }}
    ),
    AGG AS (
      SELECT DR.DATE_MONTH, SUM(I.QUANTITY)
      FROM INVENTORY I
      JOIN DATE_RANGE DR ON DATE_TRUNC('MONTH', I.DATE) <= DR.DATE_MONTH
      GROUP BY 1
    )
    
    SELECT * FROM AGG