Search code examples
postgresql

postgreSQL index not found why?


I use pgtrgm and I have an index like this:

CREATE INDEX index_users_on_username_trigram
ON users
USING gin ((LOWER(firstname) || LOWER(lastname)) gin_trgm_ops);

I want to display my data now

EXPLAIN 
SELECT * FROM 
cars_damage
LEFT JOIN users u
ON u.id = cars_damage.user_id
WHERE u.firstname % 'Name'

but the index is not found on u.firstname, why?

Nested Loop  (cost=0.14..7.60 rows=1 width=505)                                  
  ->  Seq Scan on cars_damage  (cost=0.00..3.01 rows=1 width=273)                
  ->  Index Scan using users_pkey on users u  (cost=0.14..2.37 rows=1 width=232) 
        Index Cond: (id = cars_damage.user_id)                                   
        Filter: ((firstname)::text % 'name'::text)     

Solution

  • Replace WHERE u.firstname % 'Name' with a (LOWER(firstname) || LOWER(lastname)) % 'Name' and it should work fine - after all, you didn't index the firstname but a custom expression that only includes it.

    To Postgres, the fact that you lowercased it and concatenated to something is as abstract as passing it to md5() - what's in the index no longer directly translates to what's in the column. You need to follow the expression if you need to use the expression index.
    demo at db<>fiddle

    EXPLAIN 
    SELECT * FROM 
    cars_damage
    LEFT JOIN users u
    ON u.id = cars_damage.user_id
    WHERE (LOWER(u.firstname) || LOWER(u.lastname)) % 'Name'
    
    QUERY PLAN
    Hash Join (cost=101.79..1156.29 rows=5 width=107)

      Hash Cond: ((cars_damage.user_id)::numeric = u.id)

      -> Seq Scan on cars_damage (cost=0.00..917.00 rows=50000 width=37)

      -> Hash (cost=101.66..101.66 rows=10 width=70)

            -> Bitmap Heap Scan on users u (cost=64.08..101.66 rows=10 width=70)

                  Recheck Cond: ((lower(firstname) || lower(lastname)) % 'Name'::text)

                  -> Bitmap Index Scan on index_users_on_username_trigram (cost=0.00..64.08 rows=10 width=0)

                        Index Cond: ((lower(firstname) || lower(lastname)) % 'Name'::text)

    Your original query on the same test setup:

    EXPLAIN 
    SELECT * FROM 
    cars_damage
    LEFT JOIN users u
    ON u.id = cars_damage.user_id
    WHERE u.firstname % 'Name'
    
    QUERY PLAN
    Hash Join (cost=2566.12..3620.62 rows=5 width=107)

      Hash Cond: ((cars_damage.user_id)::numeric = u.id)

      -> Seq Scan on cars_damage (cost=0.00..917.00 rows=50000 width=37)

      -> Hash (cost=2566.00..2566.00 rows=10 width=70)

            -> Seq Scan on users u (cost=0.00..2566.00 rows=10 width=70)

                  Filter: (firstname % 'Name'::text)

    Note that pg_trgm comparisons are case-insensitive so the whole lowercasing thing isn't really necessary. Quoting the doc:

    The pg_trgm module provides GiST and GIN index operator classes that allow you to create an index over a text column for the purpose of very fast similarity searches. These index types support the above-described similarity operators, and additionally support trigram-based index searches for LIKE, ILIKE, ~, ~* and = queries. The similarity comparisons are case-insensitive in a default build of pg_trgm.

    You also don't need to lowercase the search phrase on the right: you showed a query with initcapped 'Name' but a plan with a lowercased 'name'. This doesn't happen on its own, but there's no need for either you or the db to make that happen because it doesn't change anything.

    DROP INDEX IF EXISTS index_users_on_username_trigram;
    
    CREATE INDEX IF NOT EXISTS index_users_on_username_trigram
    ON users
    USING gin ((firstname||lastname) gin_trgm_ops);
    
    vacuum analyze users;
    
    EXPLAIN 
    SELECT * FROM 
    cars_damage
    LEFT JOIN users u
    ON u.id = cars_damage.user_id
    WHERE (u.firstname||u.lastname) % 'Name'
    
    QUERY PLAN
    Hash Join (cost=101.74..1156.24 rows=5 width=107)

      Hash Cond: ((cars_damage.user_id)::numeric = u.id)

      -> Seq Scan on cars_damage (cost=0.00..917.00 rows=50000 width=37)

      -> Hash (cost=101.61..101.61 rows=10 width=70)

            -> Bitmap Heap Scan on users u (cost=64.08..101.61 rows=10 width=70)

                  Recheck Cond: ((firstname || lastname) % 'Name'::text)

                  -> Bitmap Index Scan on index_users_on_username_trigram (cost=0.00..64.08 rows=10 width=0)

                        Index Cond: ((firstname || lastname) % 'Name'::text)