PostgreSQL: Select rows until sum-reduction of a single column reaches over a threshold


I would like to write a query that takes rows from an ordered table, simultaneously aggregating one column's value until said aggregated value meets a desired threshold.

An additional criterion is that the row which crosses the threshold should be included in the query results.

I have looked for other solutions done in PostgreSQL, leading me to creating the following query:

SELECT * FROM (
    SELECT *, SUM(amount) OVER (ORDER BY amount DESC) AS running_amount
    FROM public.orders WHERE price = 0.09) AS t
WHERE t.running_amount <= 15;

The issue with this query, however, is that the window function treats rows with equal amount values as peers: all rows sharing a value receive the same running total, so the sum does not accumulate row by row when amount is not unique.

As far as I can tell, the window function offers no obvious way to handle non-distinct values in the ordering column.

One alternative I have heard of is writing a PostgreSQL function, though I have no idea where to start with this sort of aggregation query.

If anyone has any ideas or know-how, I would greatly appreciate it.


Solution

  • Add a unique column (primary key) to the ORDER BY clause of the window function, e.g.:

    SELECT * FROM (
        SELECT *, SUM(amount) OVER (ORDER BY amount DESC, id) AS running_amount
        FROM public.orders WHERE price = 0.09
    ) AS t
    WHERE t.running_amount <= 15;
    

    If the table has no unique column, you can use the system column ctid as the tiebreaker instead.
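
    As a rough illustration of why the tiebreaker matters, the sketch below compares the two orderings side by side. The temporary table demo_orders and its sample values are hypothetical and not part of the original question. Note that ctid is a row's physical location, which can change when the row is updated or the table is rewritten, so it is best treated as a tiebreaker within a single query rather than as a stable identifier.

    ```sql
    -- Hypothetical sample data: two rows share amount = 3.
    CREATE TEMP TABLE demo_orders (amount numeric);
    INSERT INTO demo_orders VALUES (5), (3), (3), (1);

    SELECT amount,
           -- Default frame: tied rows are peers and share one total.
           SUM(amount) OVER (ORDER BY amount DESC)       AS tied_total,
           -- ctid breaks the tie, so the total grows row by row.
           SUM(amount) OVER (ORDER BY amount DESC, ctid) AS running_total
    FROM demo_orders
    ORDER BY amount DESC, ctid;
    -- tied_total:    5, 11, 11, 12
    -- running_total: 5,  8, 11, 12
    ```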


    You can use UNION ALL to get the violating row which passes the threshold, e.g.:

    WITH cte AS (
        SELECT *, SUM(amount) OVER (ORDER BY amount DESC, id) AS running_amount
        FROM public.orders 
        WHERE price = 0.09
    )
    SELECT * 
    FROM cte
    WHERE running_amount <= 15
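    UNION ALL (
        SELECT * 
        FROM cte
        WHERE running_amount > 15
        -- Without ORDER BY, LIMIT 1 could return any row above the threshold;
        -- the smallest running total is the first row that crosses it.
        ORDER BY running_amount
        LIMIT 1
    );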
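
    A possible single-pass alternative, sketched under the assumption that amount is never NULL and always greater than zero: subtracting the current row's amount from the running total gives the total before that row, so filtering on that value keeps every row up to and including the first one that pushes the sum past 15. This variant is not part of the original answer; it reuses the same public.orders table and id column.

    ```sql
    SELECT *
    FROM (
        SELECT *, SUM(amount) OVER (ORDER BY amount DESC, id) AS running_amount
        FROM public.orders
        WHERE price = 0.09
    ) AS t
    -- running_amount - amount is the total accumulated before this row.
    WHERE t.running_amount - t.amount <= 15
    ORDER BY t.running_amount;
    ```

    Changing <= to < would instead stop as soon as the running total reaches or exceeds 15.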