Search code examples
sqlpostgresqlsyntaxfull-text-search

magical typo corrector in postgres?


I have some trouble understanding the PostgreSQL documentation. Apparently, the example is text searching for words that are similar(?) to 'caterpiler'. Example is taken from documentation 41.3. Relevant queries are replicated below. The materialized views parts are not central to the question.

CREATE EXTENSION file_fdw;
CREATE SERVER local_file FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE words (word text NOT NULL)
  SERVER local_file
  OPTIONS (filename '/usr/share/dict/words');
-- previously established that foreign table "words"
-- contains zero counts of 'caterpiler'
SELECT word FROM words ORDER BY word <-> 'caterpiler' LIMIT 10;
-- output
/*
 word
---------------
 cater
 caterpillar
 Caterpillar
 caterpillars
 caterpillar's
 Caterpillar's
 caterer
 caterer's
 caters
 catered
(10 rows)
*/

EXPLAIN ANALYZE

 Limit  (cost=11583.61..11583.64 rows=10 width=32) (actual time=1431.591..1431.594 rows=10 loops=1)
   ->  Sort  (cost=11583.61..11804.76 rows=88459 width=32) (actual time=1431.589..1431.591 rows=10 loops=1)
         Sort Key: ((word <-> 'caterpiler'::text))
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Foreign Scan on words  (cost=0.00..9672.05 rows=88459 width=32) (actual time=0.057..1286.455 rows=479829 loops=1)
               Foreign File: /usr/share/dict/words
               Foreign File Size: 4953699
 Planning time: 0.128 ms
 Execution time: 1431.679 ms

I assume that word <-> 'catepiler' is using the tsquery <-> tsquery syntax but how does that work in this query and how is it producing the output order? Also is the 'caterpiler'::text in the EXPLAIN ANALYZE just the planner treating all strings as text by default, or does that mean some other definition of <-> is being used?

Furthermore, is a dictionary and/or thesaurus being looked up, or is this a general feature of <->? The file path sure looks like a dictionary but it does not appear to be used as such.

Thanks


Solution

  • I assume that word <-> 'catepiler' is using the tsquery <-> tsquery syntax

    No, it is using pg_trgm's text <-> text operator. If pg_trgm doesn't exist, then you get an error.

    Also is the 'caterpiler'::text in the EXPLAIN ANALYZE just the planner treating all strings as text by default, or does that mean some other definition of <-> is being used?

    The latter, a different definition is being used.

    The file path sure looks like a dictionary but it does not appear to be used as such.

    It is list of words, that is not really a dictionary in everyday meaning but nonetheless that is what the file is called. Note that the example is just using a file which often exists on UNIX-like systems. Neither the file, nor its choice of name, are part of PostgreSQL. It is just a handy thing to use in an example.