Search code examples
sqldatesumaggregateaggregate-functions

Sum based on column condition


I am trying to sum sales on my table grouped for next 30 days.

Here is the data I have.

date sales
2021-08-08 35
2021-08-08 14
2021-08-11 35
2021-09-09 22
2021-09-21 44
2021-10-16 46
2021-10-25 9
2021-10-25 1
2021-10-25 2
2021-10-25 6
2021-11-04 1
2021-11-07 1

I am expecting the result like this

date sales total 30d
2021-08-08 14 84
2021-08-08 35 84
2021-08-11 35 57
2021-09-09 22 66
2021-09-21 44 90
2021-10-16 46 76
2021-10-25 9 30
2021-10-25 6 30
2021-10-25 2 30
2021-10-25 1 30
2021-11-04 1 12
2021-11-07 1 29

Explanation: For date 2021-08-08 it's going to sum the sales from date >= 2021-08-08 til 30 days after which is 2021-09-07 so it would be 14+35+35 (2021-08-08, 2021-08-08 and 2021-08-11).

I got my work around using this query, but I think it is very inefficient and I believe there's other way using windows function though I can't wrap my head around it.

with temp as (
    select date, sum(sales) as sales_total
from test_table
group by 1,2
)
, temp2 as (
SELECT a.date, SUM(b.sales_total) total
FROM temp a, temp b
WHERE b.date >= a.date AND b.date <= a.date + interval '30' day
GROUP BY a.date
)
select a.date, a.sales, b.total 
from test_table a 
JOIN temp2 b on a.date = b.date

Any pointer to solve this problem using windows function or any other more efficient way is appreciated. Thanks!


Solution

  • Here is a way to do this using recursive cte.

    with data
       as (select start_dt,sales,row_number() over(order by start_dt) as rnk       
             from t1
           )
         ,cte_data
        as ( select d.start_dt,d.sales,d.rnk,d.start_dt as grp_dt
               from data d
              where rnk=1
            union all
            select d.start_dt,d.sales,d.rnk,case when datediff(day,d2.grp_dt,d.start_dt)>30 then 
                                               d.start_dt
                                         else d2.grp_dt
                                     end
              from cte_data d2
              join data d
                on d.rnk=d2.rnk+1
           )
      select min(start_dt) start_range,max(start_dt) end_range,sum(sales) as sum_sales
        from cte_data
     group by grp_dt
    
    +-------------+------------+-----------+
    | start_range | end_range  | sum_sales |
    +-------------+------------+-----------+
    | 2021-08-08  | 2021-08-11 |        84 |
    | 2021-09-09  | 2021-09-21 |        66 |
    | 2021-10-16  | 2021-11-07 |        66 |
    +-------------+------------+-----------+
    

    sql fiddle link https://dbfiddle.uk/2LmQe5n5

    To elaborate a bit on this. We would need some kind of recursive process to do this, as the results of the following row depends on the previous entry date, and so I doubt we could do this using window functions.