Search code examples
sqlpostgresqlindexingheapsortpostgresql-12

Indexing is not applying on the date column in postgres. Top-N-heapsort is applying not Index scan


I have a table 'post' which has millions of rows. I need top N records using the date field but it's taking too much time because of Top-N-heapsort. How can I optimize this query?

Note: On the staging server where I have less data it's working fine. Index scan is applying there.

DB Version: Postgres 12

    CREATE TABLE public.posts
(
    post_id integer NOT NULL DEFAULT (primary key),
    team_id integer NOT NULL,
    is_active boolean DEFAULT true,
    "createdAt" timestamp with time zone,
    "updatedAt" timestamp with time zone,
)

Index column are:

"createdAt DESC NULLS Last",
(team_id ASC NULLS LAST, "createdAt" DESC NULLS LAST),
team_id ASC NULLS LAST

QUERY:

SELECT p.*
FROM posts p
WHERE team_id = 1
AND p.is_active = true AND "createdAt" IS NOT NULL
ORDER BY "createdAt" DESC 
LIMIT 20;

Query plan:

 "Limit  (cost=138958.67..138958.72 rows=20 width=360) (actual time=356.391..356.419 rows=20 loops=1)"
"  ->  Sort  (cost=138958.67..139078.57 rows=47960 width=360) (actual time=356.389..356.402 rows=20 loops=1)"
"        Sort Key: ""createdAt"" DESC"
"        Sort Method: top-N heapsort  Memory: 34kB"
"        ->  Index Scan using posts_team_id on posts p  (cost=0.44..137682.47 rows=47960 width=360) (actual time=0.042..317.258 rows=52858 loops=1)"
"              Index Cond: (team_id = 1)"
"              Filter: (is_active AND (""createdAt"" IS NOT NULL))"
"Planning Time: 0.145 ms"
"Execution Time: 356.459 ms"

Solution

  • This is the index I would use:

    CREATE INDEX ON posts (team_id, "createdAt") WHERE is_active;
    

    It supports the WHERE condition and the ORDER BY.