Search code examples
sqlpostgresqlgroup-byaggregate-functionswindow-functions

SQL: Apply an aggregate result per day using window functions


Consider a time-series table that contains three fields time of type timestamptz, balance of type numeric, and is_spent_column of type text.

The following query generates a valid result for the last day of the given interval.

SELECT
    MAX(DATE_TRUNC('DAY', (time))) as last_day,
    SUM(balance)  FILTER ( WHERE is_spent_column is NULL ) AS value_at_last_day
FROM tbl

2010-07-12 18681.800775017498741407984000

However, I am in need of an equivalent query based on window functions to report the total value of the column named balance for all the days up to and including the given date .

Here is what I've tried so far, but without any valid result:

SELECT
    DATE_TRUNC('DAY', (time)) AS daily,
    SUM(sum(balance) FILTER ( WHERE is_spent_column is NULL ) ) OVER ( ORDER BY DATE_TRUNC('DAY', (time)) ) AS total_value_per_day
FROM tbl
group by 1
order by 1 desc

2010-07-12 16050.496339044977568391974000

2010-07-11 13103.159119670350269890284000

2010-07-10 12594.525752964512456914454000

2010-07-09 12380.159588711091681327014000

2010-07-08 12178.119542536668113577014000

2010-07-07 11995.943973804127033140014000


EDIT:

Here is a sample dataset: LINK REMOVED

The running total can be computed by applying the first query above on the entire dataset up to and including the desired day. For example, for day 2009-01-31, the result is 97.13522530000000000000, or for day 2009-01-15 when we filter time as time < '2009-01-16 00:00:00' it returns 24.446144000000000000.

What I need is an alternative query that computes the running total for each day in a single query.

EDIT 2:

Thank you all so very much for your participation and support.

The reason for differences in result sets of the queries was on the preceding ETL pipelines. Sorry for my ignorance!

Below I've provided a sample schema to test the queries.

https://www.db-fiddle.com/f/veUiRauLs23s3WUfXQu3WE/2

Now both queries given above and the query given in the answer below return the same result.


Solution

  • Consider calculating running total via window function after aggregating data to day level. And since you aggregate with a single condition, FILTER condition can be converted to basic WHERE:

    SELECT daily,
           SUM(total_balance) OVER (ORDER BY daily) AS total_value_per_day
    FROM (
        SELECT
            DATE_TRUNC('DAY', (time)) AS daily,
            SUM(balance) AS total_balance
        FROM tbl
        WHERE is_spent_column IS NULL
        GROUP BY 1
    ) AS daily_agg
    ORDER BY daily