Tags: database, postgresql, indexing, query-performance, query-planner

Postgres using primary_key index in almost every query


We're upgrading our Postgres database from version 9.3.14 to 9.4.9 and are currently in the testing phase. During testing we've run into a problem that leads to high CPU usage once the database is upgraded to 9.4.9: there are queries where Postgres 9.4 uses the primary key index even though cheaper options are available. For example, running EXPLAIN ANALYZE for the query below:

SELECT  a.id as a_id, b.col_id as col_id
FROM a
INNER JOIN b ON b.id = a.b_id
WHERE (a.col_text = 'pqrs' AND a.col_int = 1)
ORDER BY a.id ASC LIMIT 1

gives this:

Limit  (cost=0.87..4181.94 rows=1 width=8) (actual time=93014.991..93014.992 rows=1 loops=1)
 ->  Nested Loop  (cost=0.87..1551177.78 rows=371 width=8) (actual time=93014.990..93014.990 rows=1 loops=1)
       ->  Index Scan using a_pkey on a  (cost=0.43..1548042.20 rows=371 width=8) (actual time=93014.968..93014.968 rows=1 loops=1)
             Filter: ((col_int = 1) AND ((col_text)::text = 'pqrs'::text))
             Rows Removed by Filter: 16114217
       ->  Index Scan using b_pkey on b  (cost=0.43..8.44 rows=1 width=8) (actual time=0.014..0.014 rows=1 loops=1)
             Index Cond: (id = a.b_id)
Planning time: 0.291 ms
Execution time: 93015.041 ms

While the query plan for the same query in 9.3.14 gives this:

Limit  (cost=17.06..17.06 rows=1 width=8) (actual time=5.066..5.067 rows=1 loops=1)
 ->  Sort  (cost=17.06..17.06 rows=1 width=8) (actual time=5.065..5.065 rows=1 loops=1)
       Sort Key: a.id
       Sort Method: quicksort  Memory: 25kB
       ->  Nested Loop  (cost=1.00..17.05 rows=1 width=8) (actual time=5.047..5.049 rows=1 loops=1)
             ->  Index Scan using index_a_on_col_text on a  (cost=0.56..8.58 rows=1 width=8) (actual time=3.154..3.155 rows=1 loops=1)
                   Index Cond: ((col_text)::text = 'pqrs'::text)
                   Filter: (col_int = 1)
             ->  Index Scan using b_pkey on b  (cost=0.43..8.46 rows=1 width=8) (actual time=1.888..1.889 rows=1 loops=1)
                   Index Cond: (id = a.b_id)
Total runtime: 5.112 ms

If I remove the ORDER BY clause from the query, it runs fine using the proper index. I can understand that in this case (with ORDER BY) the planner tries to scan the primary key index in order and filter out non-matching rows, but as the plans show, filtering via the other index and sorting explicitly is much cheaper.
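
For illustration, here is the ORDER-BY-less variant that behaves correctly, plus a sketch of a possible workaround (not something we've adopted) that keeps the ordering but hides the sort key behind an expression so the planner can no longer satisfy the ORDER BY from a_pkey:

-- variant without ORDER BY: the planner picks index_a_on_col_text as expected
SELECT  a.id as a_id, b.col_id as col_id
FROM a
INNER JOIN b ON b.id = a.b_id
WHERE (a.col_text = 'pqrs' AND a.col_int = 1)
LIMIT 1;

-- hypothetical workaround, shown only as a sketch: ordering on an expression
-- prevents the planner from walking a_pkey to satisfy ORDER BY ... LIMIT 1
SELECT  a.id as a_id, b.col_id as col_id
FROM a
INNER JOIN b ON b.id = a.b_id
WHERE (a.col_text = 'pqrs' AND a.col_int = 1)
ORDER BY a.id + 0 ASC LIMIT 1;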

I've looked at Postgres parameters like enable_indexscan and enable_seqscan, which are on by default; we want to leave the choice between an index scan and a sequential scan to the database. We've also tried tweaking effective_cache_size, random_page_cost and seq_page_cost. enable_sort is also on.
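
For context, this is roughly how we checked and experimented with those settings in a test session; the values below are placeholders for the experiment, not our production configuration:

SHOW enable_indexscan;
SHOW enable_seqscan;
SHOW enable_sort;

-- per-session experiments only; placeholder values
SET effective_cache_size = '4GB';
SET random_page_cost = 1.1;
SET seq_page_cost = 1.0;

EXPLAIN ANALYZE
SELECT  a.id as a_id, b.col_id as col_id
FROM a
INNER JOIN b ON b.id = a.b_id
WHERE (a.col_text = 'pqrs' AND a.col_int = 1)
ORDER BY a.id ASC LIMIT 1;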

It's not happening only for this particular query; there are a few other queries where the primary key index is being used instead of other, more effective access paths.


Solution

  • After opening a case with AWS Support, this is what I got:

    I understand that you want to know why you have degraded performance on your recently upgraded instance. This is the expected, general behavior after upgrading a Postgres instance. Once the upgrade is completed, you need to run ANALYZE on each user database to update the table statistics, which also makes your SQL perform better. A convenient way to do that is with vacuumdb[1], like this:

    vacuumdb -U [your user] -d [your database] -Ze -h [your rds endpoint]

    It will only refresh the statistics your execution plans depend on, without freeing space, but it will take less time than a complete vacuum.

    And this resolved the issue. Hope this helps others who stumble upon the same problem.
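
  • For anyone who can't run vacuumdb against the instance, the same statistics refresh can be done from a plain SQL session. This is a minimal sketch, assuming you connect to the upgraded database as a user with sufficient privileges and run it on each user database:

    -- analyze-only pass (what vacuumdb -Z does): updates planner statistics without reclaiming space
    ANALYZE VERBOSE;

    -- alternatively, reclaim dead tuples while updating statistics (slower)
    VACUUM (ANALYZE, VERBOSE);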