Search code examples
postgresqldictionaryfull-text-searchsynonym

Full Text search with multiple synonyms in PostgresSQL


I am implementing Full Text Search with PostgreSQL. I am using following type query to search in document column.

FROM schema.table t0
WHERE t0.document @@ websearch_to_tsquery('error')

I am working on to use FTS Dictionaries to search for similar words. I come across C:\Program Files\PostgreSQL\14\share\tsearch_data folder where I have defined word and its synonyms in xsyn_sample.rules file. File content is as mentioned below.

# Sample rules file for eXtended Synonym (xsyn) dictionary
# format is as follows:
#
# word synonym1 synonym2 ...
#
error fault issue mistake malfunctioning

I want to use this dictionary but don't know how to use it. When I search for 'error', I wants to display result for 'error', 'fault', 'issues', 'mistakes' etc which are having similar meanings. Kindly share if you have ever come across this implementation. Few things I am asking for

  1. Is this xsyn_sample.rules is sufficient for this? If not then what other techniques can be used for this type of search?
  2. How to configure postgreSQL 14 in my local system to use this dictionary instead of 'simple' or 'english'. I know how to use both of these dictionary with select plainto_tsquery('english','errors'); and select plainto_tsquery('simple','errors'); queries. Similarly I want to use my custom dictionary.
  3. Is there any better source for dictionaries use in postgres in compare to https://www.postgresql.org/docs/current/textsearch-dictionaries.html ?

Solution

  • Don't edit the example rules file, create your own file mysyn.rules and add the synonyms there. Then create a dictionary that uses the file:

    CREATE TEXT SEARCH DICTIONARY mysyn (TEMPLATE = xsyn_template, RULES = mysyn);
    

    Then copy the English text search configuration and add your dictionary:

    CREATE TEXT SEARCH CONFIGURATION myconf (COPY = english);
    ALTER TEXT SEARCH CONFIGURATION myconf
       ALTER MAPPING FOR word, asciiword WITH mysyn, english_stem;