
Sphinx (via SphinxQL) matches without asterisks, but not with asterisks


I have an index in Sphinx; one of the words in this index is an article number, in this case 04.007.00964. When I query my index like this:

SELECT * FROM myIndex WHERE MATCH('04.007.00964')

I get one result, as expected. However, when I query it like this:

SELECT * FROM myIndex WHERE MATCH('*04.007.00964*')

I get zero results.

My index configuration is:

index myIndex
{
    source          = myIndex
    path            = D:\Tools\Sphinx\data\myIndex
    morphology      = none
    min_word_len    = 3
    min_prefix_len  = 0
    min_infix_len   = 2
    enable_star     = 1
}

I'm using v2.0.4-release.

What am I doing wrong, or what don't I understand?


Solution

  • Because of

    min_word_len    = 3
    

    The first query will effectively be:

    SELECT * FROM myIndex WHERE MATCH('007 00964')
    

    So short words are ignored, both when indexing and when querying.

    Edit to add: And "." is not in the default charset_table, which is why it's used as a separator.

    In the second query, however, "*04" is not stripped, because it is 3 characters long (counting the asterisk), but then there is nothing for it to match, because "04" is not in the index (it's shorter than min_word_len).

    ... so it's an unfortunate combination of word and infix lengths. You can easily fix it by setting min_word_len = 2.

    Edit to add: or by adding '.' to charset_table, so that it's no longer used to separate words; the whole article number is then indexed as a single word, which is longer than both min_word_len and min_infix_len.
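
    As a rough sketch, the two fixes could look like this in the index definition from the question. The charset_table value is only an assumed Latin-only example, not Sphinx's built-in default, so in practice U+2E (the "." character) would be merged into whatever table you actually use. Pick one option, not both; either change only takes effect once the index has been rebuilt.

    ```
    index myIndex
    {
        source          = myIndex
        path            = D:\Tools\Sphinx\data\myIndex
        morphology      = none
        min_prefix_len  = 0
        min_infix_len   = 2
        enable_star     = 1

        # Option 1: index two-character words, so that "04" is kept
        # and "*04" has something to match.
        min_word_len    = 2

        # Option 2 (instead of option 1): keep min_word_len = 3 and treat
        # "." (U+2E) as a word character, so 04.007.00964 becomes one word.
        # Assumed example table; adapt it to the charset you really use.
        # min_word_len  = 3
        # charset_table = 0..9, A..Z->a..z, _, a..z, U+2E
    }
    ```

    Note that with option 2 a "." anywhere in the text becomes part of the surrounding word, which may not be what you want for ordinary prose fields.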