ruby-on-rails sphinx wildcard thinking-sphinx rails-3.1

Thinking Sphinx - Wildcard search only on specific indexes


I am running thinking-sphinx 2.0.10 in my Rails 3.1 environment, and indexing and searching work pretty well. When I'm searching on my User model, I want to do a wildcard search on the name of the user but not on the email, so a user should only be returned by an email match if the given search string exactly matches the user's email. I did some research and found out that this could be done by enabling the wildcard search within the define_index block with

set_property :enable_star => true
set_property :min_infix_len => 1

and adding :infixes => true to the fields which should support wildcard search

define_index do
  indexes "CONCAT(first_name, ' ', last_name)", :as => :user_name, :infixes => true
  indexes email

  has :id, :as => :user_id

  set_property :enable_star => true
  set_property :min_infix_len => 1
end

This is from the autogenerated development.sphinx.conf

index user_core
{
  source = user_core_0
  path = /../../../../db/sphinx/development/user_core
  charset_type = utf-8
  min_infix_len = 1
  infix_fields = user_name
  enable_star = 1
}

The infix_fields are declared correctly, I guess.

The problem is that if I search for ".com" I still get all the users with a .com email address. What can be the reason for that?

Thanks for your help!


Solution

  • Sphinx indexes words in the input. Usually it indexes whole words only, but you can enable partial-word matching with the infix/prefix settings.

    Anything not defined in charset_table (you don't have one, so the default is used) is a separator.

    So "." is a separator, and your email will be indexed as several words. "com" is a word in itself.

    So searching for ".com" is just searching for "com", i.e. still a whole-word match.

    You could just add "." to your charset_table to solve this (but beware if it is used as a separator in your data; if you are only indexing name and email, it might not be an issue). You might want to add "@" too, so that the whole email can be indexed as one 'word'.


    I don't know how to put it in your Ruby config; I just know it needs to end up in the Sphinx conf file.
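
To make the tokenization argument concrete, here is a small plain-Ruby sketch. It does not call Sphinx or Thinking Sphinx; it only models the rule described above, where every character outside the charset table acts as a separator. The constants DEFAULT_WORD and EMAIL_WORD and the helpers tokenize and whole_word_match? are hypothetical names made up for this illustration, and the default table is approximated by its ASCII part only (letters folded to lowercase, digits, underscore).

```ruby
# Simplified model of Sphinx tokenization, for illustration only.
# Characters matched by the pattern are "in the charset table";
# everything else is treated as a word separator.

# Approximation of the ASCII part of the default charset_table.
DEFAULT_WORD = /[a-z0-9_]+/
# Hypothetical table with "." and "@" added as word characters.
EMAIL_WORD = /[a-z0-9_.@]+/

# Split text into lowercase words according to the given pattern.
def tokenize(text, word_pattern)
  text.downcase.scan(word_pattern)
end

# Whole-word match: every query word must appear as a full word
# in the document (no infix or prefix matching).
def whole_word_match?(query, document, word_pattern)
  doc_words = tokenize(document, word_pattern)
  query_words = tokenize(query, word_pattern)
  !query_words.empty? && query_words.all? { |w| doc_words.include?(w) }
end

email = "jane@example.com"

# With the default table the email is split into three words,
# and the query ".com" collapses to the single word "com".
p tokenize(email, DEFAULT_WORD)
p whole_word_match?(".com", email, DEFAULT_WORD)

# With "." and "@" as word characters the email is one word,
# so only the full address matches.
p tokenize(email, EMAIL_WORD)
p whole_word_match?(".com", email, EMAIL_WORD)
p whole_word_match?(email, email, EMAIL_WORD)
```

As for getting the setting into the generated conf, one thing to try (an assumption, not verified here) is set_property :charset_table => "..." inside define_index, then checking that a charset_table line shows up in development.sphinx.conf after regenerating the configuration.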