Search code examples
sqlpostgresqlgroup-bysumwindow-functions

PostgresSQL Query to Include YTD Totals like SUMIFS?


Thanks in advance for any assistance. Writing queries is definitely my biggest weakness.

I have a query written to get data on transactions for each client on each day:

select organization, 
       DATE(txn_date), 
       sum(amount) as totalamount 
from sales_receipt 
where EXTRACT(YEAR from txn_date) > 2017 
group by organization, DATE(txn_date)

This yields something like:

organization date        totalamount
client1      2018-01-01  1000.00  
client1      2018-01-02  500.00    
client1      2018-01-03  800.00  
client2      2018-01-01  600.00  
client2      2018-01-02  300.00    
client2      2018-01-03  200.00
client1      2019-01-01  800.00
client1      2019-01-02  600.00    
client1      2019-01-03  900.00
client2      2019-01-01  400.00  
client2      2019-01-02  600.00
client2      2019-01-03  800.00

I'd like to add an additional column that includes the year to date sum of totalamount for each client if possible, something like:

organization date        totalamount ytd 
client1      2018-01-01  1000.00     1000.00  
client1      2018-01-02  500.00      1500.00    
client1      2018-01-03  800.00      2300.00
client2      2018-01-01  600.00      600.00  
client2      2018-01-02  300.00      900.00
client2      2018-01-03  200.00      200.00
client1      2019-01-01  800.00      8000.00  
client1      2019-01-02  600.00      1400.00    
client1      2019-01-03  900.00      2300.00
client2      2019-01-01  400.00      400.00  
client2      2019-01-02  600.00      1000.00
client2      2019-01-03  800.00      1800.00

Is there a way to do this in the PostgreSQL query?


Solution

  • I would write this as:

    select 
        organization, 
        txn_date::date txn_date, 
        sum(amount) as totalamount,
        sum(sum(amount)) over(partition by organization, extract(year from txn_date) order by date) ytd
    from sales_receipt 
    where txn_date >= '2018-01-01'::date
    group by organization, txn_date::date
    

    You don't need an extra level of nesting to use window functions.

    Also, you seem to want a separate partition for each organization and year.

    Finally, you can use a direct predicate against the date column rather than apply a function on it (this makes the query much more efficient, especially if an index is available).