Search code examples
postgresqlindexingpostgresql-14

Postgresql gin index miss when the table has three or more fields


the sql

CREATE TABLE user_test1 (
    id bigserial PRIMARY KEY,
    name text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX idx_user_name_test1 ON user_test1 using gin (name gin_trgm_ops);
explain analyse
select name from user_test1 where name like '%123456%';

got

Seq Scan on user_test1  (cost=0.00..23.38 rows=1 width=32) (actual time=0.006..0.007 rows=0 loops=1)
   Filter: (name ~~ '%123456%'::text)
 Planning Time: 0.110 ms
 Execution Time: 0.030 ms
(4 rows)

but when without create_at

CREATE TABLE user_test2 (
    id bigserial PRIMARY KEY,
    name text NOT NULL
);
CREATE INDEX idx_user_name_test2 ON user_test2 using gin (name gin_trgm_ops);
explain analyse
select name from user_test2 where name like '%123456%';

got

Bitmap Heap Scan on user_test2  (cost=20.00..24.02 rows=1 width=32)
   Recheck Cond: (name ~~ '%123456%'::text)
   ->  Bitmap Index Scan on idx_user_name_test2  (cost=0.00..20.00 rows=1 width=0)
         Index Cond: (name ~~ '%123456%'::text)
(4 rows)

Postgre version:

PostgreSQL 14.2 (Debian 14.2-1.pgdg110+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit

env:
macos 13.1
docker postgres:14.2
IMAGE ID: 8b547b8bf0d7

Dropping these tables and rebuilding has no effect. I searched a lot of information, but did not find similar examples. I tried to submit issues to postgresql's GitHub repository, but it didn't open.


Solution

  • Your assumption is flawed because you're testing index performance in an empty table. Keep in mind that the effectiveness of an index depends on many factors, like amount of data and data distribution.

    With some data the planner will most certainly take a different direction:

    INSERT INTO user_test1 (name, created_at) 
    SELECT floor(random()*1000000+1)::int, ts 
    FROM generate_series('1970-01-01 19:00:00+00'::timestamptz, 
                         '2022-11-19 19:50:00+00'::timestamptz, '1 hour') AS ts;
    
    EXPLAIN (ANALYSE,BUFFERS)
    SELECT name FROM user_test1 
    WHERE name LIKE '%12345%';
                                                               QUERY PLAN                                                     
          
    --------------------------------------------------------------------------------------------------------------------------
    ------
     Bitmap Heap Scan on user_test1  (cost=707.92..2025.10 rows=506 width=32) (actual time=5.072..5.078 rows=7 loops=1)
       Recheck Cond: (name ~~ '%12345%'::text)
       Heap Blocks: exact=7
       Buffers: shared hit=186
       ->  Bitmap Index Scan on idx_user_name_test1  (cost=0.00..707.79 rows=506 width=0) (actual time=5.065..5.066 rows=7 loo
    ps=1)
             Index Cond: (name ~~ '%12345%'::text)
             Buffers: shared hit=179
     Planning:
       Buffers: shared hit=31
     Planning Time: 0.516 ms
     Execution Time: 5.117 ms
    (11 rows)
    

    Demo: db<>fiddle