I'm creating a multi-column full text search index and currently I have this running
CREATE INDEX products_search_document ON products
USING gin(to_tsvector('english', style_number || ' ' || brand || ' ' || style_description || ' ' || color));
This works great for queries that I'm using like this
SELECT * FROM "products"
WHERE (to_tsvector('english', style_number||' '||brand||' '||style_description||' '||color)
@@ to_tsquery('english', 'G2000'))
I'd like to use prefix matching now though so that my query would look like this:
SELECT * FROM "products"
WHERE (to_tsvector('english', style_number||' '||brand||' '||style_description||' '||color)
@@ to_tsquery('english', 'G2000:*'))
and when I do run this on my Heroku postgres instance, I'm getting a Seq Scan on products
instead of an Indexed scan.
What other index would I need to use the prefix matcher in Postgres?
Have you tried doing:
set enable_seqscan=off;
and then running your query to see if it uses it. I don't see why it wouldn't. My suspicion is the planner thinks there is not enough specificity for that particular search so thinks a sequential scan is more efficient than a fulltext scan.
That said, I think for prefix queries (where you don't won't stem equivalency to kick in e.g. postgraduate and postgres being considered equivalent) a btree text_pattern_ops, gist(gist_gtrgm_ops) or a gin index (I think spgist might be good but haven't done any metrics on that) on just the concatenated values or even (just on style_number ) if that is all you will be prefixing, would be more efficient than full text. Your query would not use tsvector, would just use
style_number LIKE 'G5000%'
style_number ILIKE 'G5000%'
and your index would be just on style_number or concatenated values
If you need case insensitivity then use gist(gist_trgm_ops) like covered here: http://www.postgresonline.com/journal/archives/212-PostgreSQL-9.1-Trigrams-teaching-LIKE-and-ILIKE-new-tricks.html