Search code examples
sqlsql-servert-sql

How to calculate a rolling sum for a group of data


I have a set of historical data, let's say from the year 2000 stored in table A. The data structure of the table simply composes of 2 column, the date dt and item

dt,   item
1/20/2000, A1
1/20/2000, A2
1/20/2000, A3
....
6/1/2024, A1
---

How do I calculate the historic count of each item for the last 2 years?

In other word, I would like to see the results like this:

202406, A1, Count of A1 until 6/1/2024 since 1/1/2000
202406, A2, Count of A2 until 6/1/2024 since 1/1/2000
202406, A3, Count of A3 until 6/1/2024 since 1/1/2000
202405, A1, Count of A1 until 5/1/2024 since 1/1/2000
202405, A2, Count of A2 until 5/1/2024 since 1/1/2000
202405, A3, Count of A3 until 5/1/2024 since 1/1/2000
...
202206, A1, Count of A1 until 6/1/2022 since 1/1/2000
202206, A2, Count of A2 until 6/1/2022 since 1/1/2000
202206, A3, Count of A3 until 6/1/2022 since 1/1/2000

I can easily group and count the item in a specific month using this:

select  202406,  item, count(item)
from table A
where datetime < 6/1/2024
group by left(convert(varchar,dt,112),6), item

I can also list the 24 past months since today

select dt2=format(dateadd(month,2-n,getdate()),'yyyyMM')
from
(
  select top (24) n=row_number() over (order by (select null))
  from sys.object)
)

But I can not combine my 2 SQL statements above to get the result I want.

Basically, I just would like to get something like this

select x, item, count(item) from tablea where month(dt) <x group by x, item

with x running from currentmonth-24 to currentmonth.

Can it be easily achieved with 1 simple SQL statement instead of using stored procedure?


Solution

  • You can accomplish what you want by:

    1. Encapsulating a modified version of your date generator in a subselect or a CTE.
    2. Join the generated dates with your data using a date inequality condition.
    3. Group by date and item while calculating and a count.
    4. Format the results as desired.
    with Months as (
        select eomonth(getdate(), 1 - N.n) as endDate
        from (
            select top (24) row_number() over (order by (select null)) as n
            from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) N1(n) -- up to 10
            cross join (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) N2(n) -- up to 100
        ) N
    )
    select
        M.endDate,
        convert(char(6), M.endDate, 112) as endMonth, -- yyyymmdd truncated
        A.item,
        count(*) as itemCount
    from TableA A
    join Months M
        on A.dt <= M.endDate
    group by M.endDate, A.item
    order by M.endDate, A.item;
    

    The date generator uses a VALUES source instead of sys.objects, because the latter did not have enough rows in the fiddle environment. It also calculates a true end-of-month end date instead of a "YYYYMM" format. I also adjusted the range down by one month, so that it ends with the current month.

    Results

    endDate endMonth item itemCount
    2022-07-31 202207 A1 1
    2022-07-31 202207 A2 1
    2022-07-31 202207 A3 1
    ... ... ... ...
    2024-06-30 202406 A1 2
    2024-06-30 202406 A2 1
    2024-06-30 202406 A3 1

    See this db<>fiddle for a demo