Search code examples
sqlpostgresqlwindow-functionsmoving-average

Calculate moving sum/count by time condition using window function and filter PostgreSQL


I want to calculate the sum 29 previous days in the 30th-day row, I use filter and window function but the FILTER not work,

it still sums from the beginning to the end if I use:

Select *, Sum(quantity) filter (where time between time - interval '29 day' and time) over ()
from t1 

it show null column if I use:

Select *, Sum(quantity) filter (where time between time - interval '29 day' and time - interval '1 day') over ()
from t1

Data, I reduce columns for simplicity

Time        sum_quantity
2020-01-01  1
2020-01-02  2
2020-01-03  3
2020-01-04  6
    ....
2020-01-30  100

Data type: Time is date and quantity is integer

Desired result: Should have the same column as first table and add this moving sum column

Day 30 = total quantity of day 1 to day 29, for every 30 days

How to fix this


Solution

  • You want a window function with a window frame definition using range:

    select t1.*,
           sum(quantity) over (order by time
                               range between interval '29 day' preceding and current row
                              ) 
    from t1 ;
    

    EDIT:

    If you have data for all dates, you can use rows:

    select t1.*,
           sum(quantity) over (order by time
                               rows between 29 preceding and current row
                              ) 
    from t1 ;
    

    EDIT II:

    If you need to deal with missing days in older versions of Postgres that do not support range, then expanding the data is probably the simplest method:

    select t1.*,
           sum(quantity) over (order by time
                               rows between 29 preceding and current row
                               ) 
    from (select generate_series(min(t1.time), max(t1.time), interval '1 day') as dte
          from t1
         ) d left join
         t1
         on d.dte = t1.time;
    

    You may want to filter out the additional rows:

    select t1.*
    from (select t1.*,
                 sum(quantity) over (order by time
                                     rows between 29 preceding and current row
                                     ) as running_sum
          from (select generate_series(min(t1.time), max(t1.time), interval '1 day') as dte
                from t1
               ) d left join
               t1
               on d.dte = t1.time
         ) t1
    where t1.time is not null;