Task: Implement full text search in multiple tables in PostgreSQL.
For example, a table of projects - projects: There are two fields in which need to search: title and description. Create an index on this table:
ALTER TABLE public.projects ADD COLUMN search_fts tsvector;
UPDATE public.projects SET search_fts =
to_tsvector(
coalesce(title, '') ||' '||
coalesce(description, '') || ' '
);
CREATE INDEX in_projects_idx ON public.projects
USING gin (search_fts);
Now search is simple:
SELECT
DISTINCT p.id,
p.title,
ts_rank(
p.search_fts,
to_tsquery('word1 | word2')
) as rank
FROM
projects p
WHERE
p.search_fts @@ to_tsquery('word1 | word2')
ORDER BY rank DESC;
Sophisticate. Now we need to look at the table and project categories - project_categories. Similar to create tsvector and for the table on the field name. Search query now is:
SELECT
DISTINCT p.id,
p.title,
category.name as categoryName,
ts_rank(
(
coalesce(p.search_fts, '') ||' '||
coalesce(category.search_fts, '')
),
to_tsquery('word1 | word2 | categoryName')
) as rank
FROM
projects p
LEFT JOIN project_categories category
ON p.category_id = category.category_id
WHERE
(
coalesce(p.search_fts, '') ||' '||
coalesce(category.search_fts, '')
) @@ to_tsquery('word1 | word2 | categoryName')
ORDER BY rank DESC;
More sophisticate. Now the search has to work on dozens of tables related to projects as a one-to-many and many-to-many.
Number join's growing. Combine to a growing number of fields.
Actually the question the right way I'm going? How would you solve this problem?
I also think about the version with view. For example, built on such a request:
(
SELECT
p.id as project_id,
p.search_fts
FROM projects p
) UNION ALL (
SELECT
p.id as project_id,
category.search_fts
FROM projects p
JOIN project_categories category
ON p.category_id = category.category_id
)
Finally, we obtain the overall index. By which to search. But in the view have to combine a dozen requests for various communication options. Interested in your views on this option.
Sorry for my english.
You have a couple of options:
Use triggers on all the input tables to maintain a summary table that combines all the records of interest in text rows. This works OK for simple-ish cases like a customer
that has 1 or more phone
and one or more address
entries, which you can put in the summary search table as |customername|customerid|phone1|phone2|phone3|...|address1|address2|...
using simple concatenation and string_agg
over joined tables. It scales poorly to deeply nested relationships, though, and can impose some locking and concurrency problems.
Use an external search system like Apache Solr that's designed to solve complex search problems.