Search code examples
ruby-on-railspostgresqlherokufull-text-searchprefix

Correct Postgres full text search indexes


I'm creating a multi-column full text search index and currently I have this running

CREATE INDEX products_search_document ON products
USING gin(to_tsvector('english', style_number || ' ' || brand || ' ' || style_description || ' ' || color));

This works great for queries that I'm using like this

SELECT * FROM "products"
WHERE (to_tsvector('english', style_number||' '||brand||' '||style_description||' '||color)
      @@ to_tsquery('english', 'G2000'))

I'd like to use prefix matching now though so that my query would look like this:

SELECT * FROM "products"
WHERE (to_tsvector('english', style_number||' '||brand||' '||style_description||' '||color)
      @@ to_tsquery('english', 'G2000:*'))

and when I do run this on my Heroku postgres instance, I'm getting a Seq Scan on products instead of an Indexed scan.

What other index would I need to use the prefix matcher in Postgres?


Solution

  • Have you tried doing:

    set enable_seqscan=off; 
    

    and then running your query to see if it uses it. I don't see why it wouldn't. My suspicion is the planner thinks there is not enough specificity for that particular search so thinks a sequential scan is more efficient than a fulltext scan.

    That said, I think for prefix queries (where you don't won't stem equivalency to kick in e.g. postgraduate and postgres being considered equivalent) a btree text_pattern_ops, gist(gist_gtrgm_ops) or a gin index (I think spgist might be good but haven't done any metrics on that) on just the concatenated values or even (just on style_number ) if that is all you will be prefixing, would be more efficient than full text. Your query would not use tsvector, would just use

    style_number LIKE 'G5000%'

    style_number ILIKE 'G5000%'

    and your index would be just on style_number or concatenated values

    If you need case insensitivity then use gist(gist_trgm_ops) like covered here: http://www.postgresonline.com/journal/archives/212-PostgreSQL-9.1-Trigrams-teaching-LIKE-and-ILIKE-new-tricks.html