Search code examples
postgresqlfull-text-searchstemming

Why Doesn't This Full Text Search Match in PostgreSQL?


I'm evaluating PostgreSQL to see if it is a viable alternative for ElasticSearch to begin with (migrating later is fine). I've been reading that PG full text capability is now 'good enough'. I'm running version 11.

Why doesn't this detect a match? I thought stemming would have easily detected different forms of the word "big":

SELECT to_tsvector('english', 'bigger') @@ to_tsquery('english', 'big')

Am I using the wrong configuration?


Solution

  • You can also reuse the scripts english.sh and english.sql from https://dba.stackexchange.com/questions/57058/how-do-i-use-an-ispell-dictionary-with-postgres-text-search.

    I have modified in the generated dictionaries: in english.affix I have added the IG > GER rule:

    flag *R:
        E           >       R               # As in skate > skater
        [^AEIOU]Y   >       -Y,IER          # As in multiply > multiplier
        [AEIOU]Y    >       ER              # As in convey > conveyer
        [^EY]       >       ER              # As in build > builder
        IG          >       GER             # For big > bigger
    

    in english.dict I have modified

    big/PY
    

    to

    big/PYR
    

    After running english.sql for the current database (you need to modify database name in the script):

    postgres=# select ts_debug('english bigger');
    select ts_debug('english bigger');
                                                  ts_debug                                              
    ----------------------------------------------------------------------------------------------------
     (asciiword,"Word, all ASCII",english,"{english_ispell,english_stem}",english_ispell,{english})
     (blank,"Space symbols"," ",{},,)
     (asciiword,"Word, all ASCII",bigger,"{english_ispell,english_stem}",english_ispell,"{bigger,big}")
    (3 rows)
    
    postgres=# SELECT to_tsvector('english bigger') @@ to_tsquery('english', 'big');
    SELECT to_tsvector('english bigger') @@ to_tsquery('english', 'big');
     ?column? 
    ----------
     t
    (1 row)