Given this table:
# select * from messages;
id | verbosity |
---|---|
1 | 20 |
2 | 20 |
3 | 20 |
4 | 30 |
5 | 100 |
(5 rows) I would like to select N messages for which sum of verbosity
is lower than N. So if N = 70
, the desired result will be messages with id 1, 2 and 3. It's important the solution is database-independent. It should work at least on PostgreSQL and SQLite.
Something like:
SELECT * FROM messages GROUP BY id HAVING SUM(verbosity) < 70;
doesn't sum all values from the verbosity
column.
SELECT m.id, sum(m1.verbosity) AS total
FROM messages m
JOIN messages m1 ON m1.id <= m.id
WHERE m.verbosity < 70 -- optional, to avoid pointless evaluation
GROUP BY m.id
HAVING SUM(m1.verbosity) < 70
ORDER BY total DESC
LIMIT 1;
This assumes a unique, ascending id
like you have in your example.
In modern Postgres - or generally with modern standard SQL (but not in SQLite):
WITH cte AS (
SELECT *, sum(verbosity) OVER (ORDER BY id) AS total
FROM messages
)
SELECT *
FROM cte
WHERE total < 70
ORDER BY id;
Should be faster for big tables where you only retrieve a small set.
WITH RECURSIVE cte AS (
( -- parentheses required
SELECT id, verbosity, verbosity AS total
FROM messages
ORDER BY id
LIMIT 1
)
UNION ALL
SELECT c1.id, c1.verbosity, c.total + c1.verbosity
FROM cte c
JOIN LATERAL (
SELECT *
FROM messages
WHERE id > c.id
ORDER BY id
LIMIT 1
) c1 ON c1.verbosity < 70 - c.total
WHERE c.total < 70
)
SELECT *
FROM cte
ORDER BY id;
All standard SQL, except for LIMIT
.
Strictly speaking, there is no such thing as "database-independent". There are various SQL-standards, but no RDBMS complies completely. LIMIT
works for PostgreSQL and SQLite (and some others). Use TOP 1
for SQL Server, rownum
for Oracle. Here's a comprehensive list on Wikipedia.
The SQL:2008 standard would be:
...
FETCH FIRST 1 ROWS ONLY
... which PostgreSQL supports - but hardly any other RDBMS.
The pure alternative that works with more systems would be to wrap it in a subquery and
SELECT max(total) FROM <subquery>
But that is slow and unwieldy.