lets say I have table posts
with this columns:
top_totle,title,sub_title,text
I need to have full text search on all this column and order by relevance where top_title need to be more important then title etc.
so i have 2 question which are the same, what is the best way of making an index for this and how to format the query to best support this index?
index options: I can create combined full text index on all of this column or create separate index for each of this column
which is the prefered way? option 1:
SELECT
title,
MATCH (top_title) AGAINST ('text' IN BOOLEAN MODE) as toptitle_score,
MATCH (title) AGAINST ('text' IN BOOLEAN MODE) as title_score,
MATCH (sub_text) AGAINST ('text' IN BOOLEAN MODE) as sub_text_score,
FROM
`posts`
WHERE
MATCH (top_title,title , sub_text ) AGAINST ('text' IN BOOLEAN MODE)
and `posts`.`deleted_at` IS NULL
AND `published_at` IS NOT NULL
Order by toptitle_score desc,
Order by title_score desc ,
Order by subtext_score desc
option 2:
SELECT
title,
MATCH (top_title) AGAINST ('text' IN BOOLEAN MODE) as toptitle_score,
MATCH (title) AGAINST ('text' IN BOOLEAN MODE) as title_score,
MATCH (sub_text) AGAINST ('text' IN BOOLEAN MODE) as sub_text_score,
FROM
`posts`
WHERE
(MATCH (top_title) AGAINST ('text' IN BOOLEAN MODE)
or MATCH (title) AGAINST ('text' IN BOOLEAN MODE)
or MATCH (sub_text) AGAINST ('text' IN BOOLEAN MODE))
and `posts`.`deleted_at` IS NULL
AND `published_at` IS NOT NULL
Order by toptitle_score desc,
Order by title_score desc ,
Order by subtext_score desc
option 3:
is there some smarter way?
Option 1 is good. It needs 4 FT indexes (one per column, plus one with all 3 columns). Don't repeat ORDER BY
:
ORDER BY toptitle_score DESC ,
title_score DESC ,
subtext_score DESC
Option 2 is not a viable contender. It needs only 3 indexes (not much savings) but is a lot slower due to OR
.
Option 3... (Option 1, as fixed, plus...)
The ORDER BY
you are using is probably "wrong" for what you want. For example, it will shove to the end of the list any rows without text
in toptitle
. Perhaps you want some "weighted" version:
ORDER BY
9 * top_title_score +
3 * title_score +
1 * sub_text_score DESC
(The 9,3,1 are rather arbitrary. It says that if 'text' shows up more than 3 times in title
, that is more important than showing up once in top_title
-- or something like that.)