Search code examples
sqlpostgresqlwindow-functions

Postgres Rows Between Causing Duplicates


I have a simple query that calculates the sum of values for current row and 11 preceding rows. It works fine when the number of rows is greater than 12 , but when it is less than 12 the data gets duplicated to fill in the missing values.

TOTALS TABLE:

ID|Report_Month| Total
1 |2018-08-01 |5
2 |2018-09-01 |25
3 |2018-10-01  |15

Sample Code:

select distinct 
         ID,
         Report_Month,
         Total,
         sum(Total) over (partition by ID order by report_month rows between 11 preceding and current row) as Running_Total
from TOTALS_TABLE;

Expected Output:

ID|Report_Month|Total|Running_Total
1 | 2018-08-01 | 5 | 5
2 | 2018-09-01 | 25 | 30
3 | 2018-10-01 | 15 | 45

Actual Output:

1 | 2018-08-01 | 5 | 5
1 | 2018-08-01 | 5 | 10
1 | 2018-08-01 | 5 | 15
1 | 2018-08-01 | 5 | 20
2 | 2018-09-01 | 25 | 45
2 | 2018-09-01 | 25 | 70
2 | 2018-09-01 | 25 | 95
2 | 2018-09-01 | 25 | 120
3 | 2018-10-01 | 15 | 135
3 | 2018-10-01 | 15 | 150
3 | 2018-10-01 | 15 | 165
3 | 2018-10-01 | 15 | 180

Any help would be greatly appreciated, I feel like I am very close probably missing something.


Solution

  • The clue is the select distinct. This should not be necessary. If the underlying table has duplicates, you should fix that. In the meantime, you can try to adjust the query.

    I'm not sure what the right fix is. Here are two possibilities.

    If entire rows are duplicated:

    select ID, Report_Month, Total,
           sum(Total) over (partition by ID order by report_month rows between 11 preceding and current row) as Running_Total
    from (select distinct tt.*
          from TOTALS_TABLE tt
         ) tt;
    

    If the totals table has subtotals on each dy that need to be added up:

    select ID, Report_Month,
           sum(Total) as month_total,
           sum(sum(Total)) over (partition by ID order by report_month rows between 11 preceding and current row) as Running_Total
    from TOTALS_TABLE tt
    group by id, Report_Month;