Tags: postgresql, query-optimization, postgresql-performance

Postgres query making high disk IO after upgrade to version 11


A Postgres query started taking high read IOPS and CPU after an upgrade from RDS version 9.6 to RDS version 11. The data set is the same as it was before the upgrade. I am not sure what the issue is.

Below is the explain plan:

Could it be because the index got corrupted?

Explain (analyze true, verbose true, costs true, buffers true, timing true )
select consumertr0_.ref_id as col_0_0_
from consumer_transactions consumertr0_
where (consumertr0_.remaining_amount is not null)
  and (consumertr0_.expiry_time is not null)
  and consumertr0_.expiry_time>'2020-12-15T00:00:00'
  and consumertr0_.expiry_time<now()
  and consumertr0_.remaining_amount>0
order by consumertr0_.expiry_time asc
limit 20000;

                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..80391.67 rows=20000 width=24) (actual time=191716.213..192077.229 rows=20000 loops=1)
   Output: ref_id, expiry_time
   Buffers: shared hit=9481343 read=1566736
   I/O Timings: read=97.486
   ->  Index Scan using consumer_transactions_expiry_time_remaining_amount on public.consumer_transactions consumertr0_  (cost=0.57..1109723.40 rows=276081 width=24) (actual time=191716.211..192075.241 rows=20000 loops=1)
         Output: ref_id, expiry_time
         Index Cond: ((consumertr0_.expiry_time > '2020-12-15 00:00:00'::timestamp without time zone) AND (consumertr0_.expiry_time < now()))
         Buffers: shared hit=9481343 read=1566736
         I/O Timings: read=97.486
 Planning Time: 1.525 ms
 Execution Time: 192078.720 ms
(11 rows)

index definition:

"consumer_transactions_expiry_time_remaining_amount" btree
   (expiry_time, remaining_amount)
WHERE expiry_time IS NOT NULL
  AND remaining_amount IS NOT NULL
  AND remaining_amount > 0::numeric

Analyze details:

        relname         |         last_analyze          | last_autoanalyze
------------------------+-------------------------------+------------------
 consumer_transactions  | 2021-01-24 22:00:03.144379+00 |
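
The analyze timestamps above come from the standard pg_stat_user_tables statistics view; the exact query used is not shown, but it would look roughly like this:

SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'consumer_transactions';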

The same number of records had previously been processed very quickly with a low IOPS requirement, though I do not have an explain plan from the previous version 9.6.

Solution: I created the same index with a different name and that solved the issue. I will drop the old index once I identify why the old index suddenly became so slow right after the upgrade.
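
The exact commands are not shown here, but recreating the index under a new name looks roughly like this (CONCURRENTLY is an assumption, used so the rebuild does not block writes; the columns and predicate are copied from the old index definition above):

-- build the same index again under a new name
CREATE INDEX CONCURRENTLY consumer_transactions_expiry_time_remaining_amount2
    ON public.consumer_transactions (expiry_time, remaining_amount)
    WHERE expiry_time IS NOT NULL
      AND remaining_amount IS NOT NULL
      AND remaining_amount > 0::numeric;

-- only after confirming the planner uses the new index:
-- DROP INDEX CONCURRENTLY consumer_transactions_expiry_time_remaining_amount;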

Explain plan with new index:

explain (analyze true, verbose true, costs true, buffers true, timing true )
select consumertr0_.ref_id as col_0_0_
from consumer_transactions consumertr0_
where (consumertr0_.remaining_amount is not null)
  and (consumertr0_.expiry_time is not null)
  and consumertr0_.expiry_time>'2019-07-01T00:00:00'
  and consumertr0_.expiry_time<now()
  and consumertr0_.remaining_amount>0
order by consumertr0_.expiry_time asc
limit 20000;

            
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..73592.06 rows=20000 width=24) (actual time=0.048..18.307 rows=20000 loops=1)
   Output: ref_id, expiry_time
   Buffers: shared hit=11140
   ->  Index Scan using consumer_transactions_expiry_time_remaining_amount2 on public.consumer_transactions consumertr0_  (cost=0.57..22273478.26 rows=6053275 width=24) (actual time=0.047..16.119 rows=20000 loops=1)
         Output: ref_id, expiry_time
         Index Cond: ((consumertr0_.expiry_time > '2019-07-01 00:00:00'::timestamp without time zone) AND (consumertr0_.expiry_time < now()))
         Buffers: shared hit=11140
 Planning Time: 1.160 ms
 Execution Time: 19.600 ms
(9 rows)


Also, the older explain plan starts at actual time 191716.211 while the new one starts at 0.047. I do not understand where the time before 191716.211 is spent.

FYI: index bloat details:

 current_database | schemaname |        tblname        |                       idxname                        | real_size  | extra_size |   extra_ratio    | fillfactor | bloat_size |   bloat_ratio    | is_na
------------------+------------+-----------------------+-------------------------------------------------------+------------+------------+------------------+------------+------------+------------------+-------
 proddb           | public     | consumer_transactions | consumer_transactions_expiry_time_remaining_amount   | 4748820480 | 3698360320 | 77.8795563145819 |         90 | 3583516672 | 75.4611947765185 | f
 proddb           | public     | consumer_transactions | consumer_transactions_expiry_time_remaining_amount2  | 1755013120 |  704552960 | 40.1451676896866 |         90 |  589709312 | 33.6014190024973 | f
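
For a quick size comparison without running the full bloat-estimation query that produced the figures above, the built-in statistics views can be used; a sketch:

-- compare physical index sizes and how often each index is used
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'consumer_transactions';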

Solution

  • The old index was quite bloated: scanning it had to look at 11048079 8kB blocks (the plan's 9481343 shared hits plus the 1566736 blocks read from disk) to find the matching rows, while the new index had to look at only 11140 blocks.

    I am not sure how the index got into this state.

    The second index column seems pretty useless: the remaining_amount > 0 condition is already enforced by the index's WHERE clause, so the extra key column only makes the index wider.

    The perfect index for this query would be:

    CREATE INDEX ON public.consumer_transactions (expiry_time) INCLUDE (ref_id)
    WHERE remaining_amount > 0;
    

    If you VACUUM the table, you would get a fast index-only scan.
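
    For example, a plain VACUUM is enough to refresh the visibility map that index-only scans rely on (a sketch):

    -- update the visibility map so the planner can use an index-only scan
    VACUUM public.consumer_transactions;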