Search code examples
mysqlinnodbfull-text-indexing

How MySQL will use FT index when searching is performed with an asterisk (wildcard) operator?


Basically, I have huge table (~30m records) that have a fulltext index at one of its columns.

The search query looks like this:

... WHERE MATCH(body) AGAINST('+Hello +my*' IN BOOLEAN MODE) ...

My storage engine is InnoDB, so we have some limitations:

  • The minimum word length is 3 characters.

However, the documentation says this:

If a word is specified with the truncation operator, it is not stripped from a boolean query, even if it is too short (as determined from the ft_min_word_len setting) or a stopword. This occurs because the word is not seen as too short or a stopword, but as a prefix that must be present in the document in the form of a word that begins with the prefix. Suppose that ft_min_word_len=4.

The question is: How MySQL will use FT index in such cases? The word bae shouldn't be present in any index as it doesn't meet the requirement for minimum word length. Maybe queries like this will be a bit slower?


Solution

  • The query may be slightly slower but not as a result of your reasoning.

    The min word length setting is also used when the index is created, so innodb will not index words that are shorter than the minimum word length.

    During the execution of a full text search, innodb again checks the length of words being searched against the min word length limit and eliminates words that are shorter than the limit because they cannot be found in the index. So, if you had a search criteria of 'my' (note the absence of the asterix), this would be ignored by innodb.

    However, when you use the wildcard operator with a number of charaters that are less than the limit (such as your my*, which has only two charaters), these are still included in the search because innodb will look at the pattern, not just the word.

    Obviously, checking for complete match only is faster than checking for complete match and the beginning of the word, but there will not be significant difference in speed.