I have a users table with columns (id, xp, ...) and around 1.5 million rows.
I am getting someone's position in the XP leaderboard with the following query (which took 33 seconds to execute):
EXPLAIN ANALYZE WITH counts AS (
SELECT DISTINCT
id,
ROW_NUMBER () OVER (ORDER BY xp DESC)
FROM
users
) SELECT
*
FROM
counts
WHERE
id=1;
Subquery Scan on counts (cost=344492.80..395160.57 rows=7404 width=16) (actual time=30683.244..32174.117 rows=1 loops=1)
Filter: (counts.id = '1'::bigint)
Rows Removed by Filter: 1481060
-> HashAggregate (cost=344492.80..376651.79 rows=1480702 width=24) (actual time=30679.440..32034.921 rows=1481061 loops=1)
Group Key: users.id, row_number() OVER (?)"
Planned Partitions: 64 Batches: 65 Memory Usage: 4369kB Disk Usage: 125960kB
-> WindowAgg (cost=212155.06..238067.34 rows=1480702 width=24) (actual time=2983.137..20302.548 rows=1481061 loops=1)
-> Sort (cost=212155.06..215856.81 rows=1480702 width=16) (actual time=2983.082..5040.782 rows=1481061 loops=1)
Sort Key: users.xp DESC
Sort Method: external merge Disk: 37760kB
-> Seq Scan on users (cost=0.00..35094.02 rows=1480702 width=16) (actual time=25.467..880.626 rows=1481061 loops=1)
Planning Time: 2.593 ms
JIT:
Functions: 14
Options: Inlining false, Optimization false, Expressions true, Deforming true"
Timing: Generation 12.061 ms, Inlining 0.000 ms, Optimization 1.503 ms, Emission 26.086 ms, Total 39.650 ms"
Execution Time: 32325.206 ms
My table definition:
CREATE TABLE users
(
id bigint NOT NULL
CONSTRAINT users_pkey
PRIMARY KEY,
xp bigint DEFAULT 0 NOT NULL,
...
);
CREATE INDEX user_xp_leaderboard_index
ON users (xp DESC, id ASC);
But it is extremely slow. Although it is not surprising considering it sorts the whole table and filter it, I don't know how I could improve/optimize this query.
I did SET work_mem TO '1 GB';
. It helped a bit but not much.
Any help would be appreciated. Thanks in advance.
You can write the query like this:
select count(*)
from users u
where u.xp >= (select u2.xp from users u2 where u2.id = 1);
This can take advantage of an index on users(id, xp)
. This should totally eliminate any sorting. An index on users(xp)
could also be helpful if the rows are quite wide and Postgres can use an index-only scan.