Search code examples
postgresqldatabase-administration

Different Indexes are used for same query with different companyid's


We are running the following query on our RDS instance and it is taking a huge amount of time, and for different companyId it executes with in a second.

Query:

select *
from items_1_primary
WHERE items_1_primary.item_type_id IN (1,2)
  AND items_1_primary.hidden IS NULL
  AND items_1_primary.company_id = 9130347227057236
ORDER BY items_1_primary.id
LIMIT 1 OFFSET 0;

FYI This is the query plan for two different companies

inventorymigrationprod=> explain analyze select * from items_1_primary WHERE items_1_primary.item_type_id IN (1,2) AND items_1_primary.hidden IS NULL AND items_1_primary.company_id =9130347227057236 ORDER BY items_1_primary.id LIMIT 1 OFFSET 0;

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.57..1894.58 rows=1 width=332) (actual time=372100.718..372100.719 rows=1 loops=1)
  -> Index Scan using items_1_primary_pkey on items_1_primary (cost=0.57..175867668.27 rows=92855 width=332) (actual time=372100.717..372100.717 rows=1 loops=1)
     Filter: ((hidden IS NULL) AND (item_type_id = ANY ('{1,2}'::numeric[])) AND (company_id = '9130347227057236'::numeric))
     Rows Removed by Filter: 535927031
Planning Time: 1.626 ms
Execution Time: 372100.745 ms
(6 rows)
inventorymigrationprod=> explain analyze select * from items_1_primary WHERE items_1_primary.item_type_id IN (1,2) AND items_1_primary.hidden IS NULL AND items_1_primary.company_id =9130348260181756 ORDER BY items_1_primary.id LIMIT 1 OFFSET 0;
                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=10659.77..10659.77 rows=1 width=332) (actual time=9.559..9.560 rows=1 loops=1)
  -> Sort (cost=10659.77..10665.38 rows=2242 width=332) (actual time=9.557..9.558 rows=1 loops=1)
     Sort Key: id
     Sort Method: top-N heapsort Memory: 25kB
     -> Index Scan using items_1_primary_company_id_item_type_id on items_1_primary (cost=0.57..10648.56 rows=2242 width=332) (actual time=0.057..6.117 rows=10823 loops=1)
        Index Cond: ((company_id = '9130348260181756'::numeric) AND (item_type_id = ANY ('{1,2}'::numeric[])))
        Filter: (hidden IS NULL)
Planning Time: 0.138 ms
Execution Time: 9.589 ms
(9 rows)

DB Instance(s): inventory-migration-prod-cluster

Indexes On Table
--------------------------
indexes:
“items_1_primary_pkey” PRIMARY KEY, btree (id)
 “items_1_primary_company_id_item_type_id” btree (company_id, item_type_id)
 “items_1_primary_uniqueness” UNIQUE CONSTRAINT, btree (company_id, item_id)

Solution

  • Either PostgreSQL over-estimates the number of rows that satisfy the condition, or they are distributed in an unfavorable way within the table.

    If the first is the case, and ANALYZE (perhaps with increased default_statistics_target) could solve the problem.

    But whatever causes the problem, you can avoid it by changing the ORDER BY clause so that PostgreSQL cannot use an index:

    ORDER BY items_1_primary.id + 0