I use pgtrgm and I have an index like this:
CREATE INDEX index_users_on_username_trigram
ON users
USING gin ((LOWER(firstname) || LOWER(lastname)) gin_trgm_ops);
I want to display my data now
EXPLAIN
SELECT * FROM
cars_damage
LEFT JOIN users u
ON u.id = cars_damage.user_id
WHERE u.firstname % 'Name'
but the index is not found on u.firstname, why?
Nested Loop (cost=0.14..7.60 rows=1 width=505)
-> Seq Scan on cars_damage (cost=0.00..3.01 rows=1 width=273)
-> Index Scan using users_pkey on users u (cost=0.14..2.37 rows=1 width=232)
Index Cond: (id = cars_damage.user_id)
Filter: ((firstname)::text % 'name'::text)
Replace WHERE u.firstname % 'Name'
with a (LOWER(firstname) || LOWER(lastname)) % 'Name'
and it should work fine - after all, you didn't index the firstname
but a custom expression that only includes it.
To Postgres, the fact that you lowercased it and concatenated to something is as abstract as passing it to md5()
- what's in the index no longer directly translates to what's in the column. You need to follow the expression if you need to use the expression index.
demo at db<>fiddle
EXPLAIN
SELECT * FROM
cars_damage
LEFT JOIN users u
ON u.id = cars_damage.user_id
WHERE (LOWER(u.firstname) || LOWER(u.lastname)) % 'Name'
QUERY PLAN |
---|
Hash Join (cost=101.79..1156.29 rows=5 width=107) |
Hash Cond: ((cars_damage.user_id)::numeric = u.id) |
-> Seq Scan on cars_damage (cost=0.00..917.00 rows=50000 width=37) |
-> Hash (cost=101.66..101.66 rows=10 width=70) |
-> Bitmap Heap Scan on users u (cost=64.08..101.66 rows=10 width=70) |
Recheck Cond: ((lower(firstname) || lower(lastname)) % 'Name'::text) |
-> Bitmap Index Scan on index_users_on_username_trigram (cost=0.00..64.08 rows=10 width=0) |
Index Cond: ((lower(firstname) || lower(lastname)) % 'Name'::text) |
Your original query on the same test setup:
EXPLAIN
SELECT * FROM
cars_damage
LEFT JOIN users u
ON u.id = cars_damage.user_id
WHERE u.firstname % 'Name'
QUERY PLAN |
---|
Hash Join (cost=2566.12..3620.62 rows=5 width=107) |
Hash Cond: ((cars_damage.user_id)::numeric = u.id) |
-> Seq Scan on cars_damage (cost=0.00..917.00 rows=50000 width=37) |
-> Hash (cost=2566.00..2566.00 rows=10 width=70) |
-> Seq Scan on users u (cost=0.00..2566.00 rows=10 width=70) |
Filter: (firstname % 'Name'::text) |
Note that pg_trgm
comparisons are case-insensitive so the whole lowercasing thing isn't really necessary. Quoting the doc:
The
pg_trgm
module provides GiST and GIN index operator classes that allow you to create an index over a text column for the purpose of very fast similarity searches. These index types support the above-described similarity operators, and additionally support trigram-based index searches forLIKE
,ILIKE
,~
,~*
and=
queries. The similarity comparisons are case-insensitive in a default build ofpg_trgm
.
You also don't need to lowercase the search phrase on the right: you showed a query with initcapped 'Name'
but a plan with a lowercased 'name'
. This doesn't happen on its own, but there's no need for either you or the db to make that happen because it doesn't change anything.
DROP INDEX IF EXISTS index_users_on_username_trigram;
CREATE INDEX IF NOT EXISTS index_users_on_username_trigram
ON users
USING gin ((firstname||lastname) gin_trgm_ops);
vacuum analyze users;
EXPLAIN
SELECT * FROM
cars_damage
LEFT JOIN users u
ON u.id = cars_damage.user_id
WHERE (u.firstname||u.lastname) % 'Name'
QUERY PLAN |
---|
Hash Join (cost=101.74..1156.24 rows=5 width=107) |
Hash Cond: ((cars_damage.user_id)::numeric = u.id) |
-> Seq Scan on cars_damage (cost=0.00..917.00 rows=50000 width=37) |
-> Hash (cost=101.61..101.61 rows=10 width=70) |
-> Bitmap Heap Scan on users u (cost=64.08..101.61 rows=10 width=70) |
Recheck Cond: ((firstname || lastname) % 'Name'::text) |
-> Bitmap Index Scan on index_users_on_username_trigram (cost=0.00..64.08 rows=10 width=0) |
Index Cond: ((firstname || lastname) % 'Name'::text) |