I have Sunspot/Solr deployed on a site for searching. I want partial searching on Part Numbers with a hyphen down to two characters.
My current versions are:
Solr & Lucene 3.5
sunspot (2.0.0)
sunspot_rails (2.0.0)
sunspot_solr (2.0.0)
My config file:
<fieldType name="n_gram_text" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"/>
  </analyzer>
</fieldType>
<dynamicField name="*_ngram" stored="false" type="n_gram_text" multiValued="true" indexed="true"/>
Sample Part Number:
455880-1
So if I use the SOLR Admin console and search for "-1" I get results back. In the analyzer I confirmed that "-1" was a gram and that the search query "-1" matches it and "0-1".
But when I perform this search on my website, it fails. If I search for "0-1" I get results, but if it's just "-1" it gives me nothing. I've tried escaping it as "\-1", but that doesn't change the outcome.
What else could I troubleshoot between the working Solr and not working Sunspot?
In my Rails logs I have the following:
SOLR Request (7.5ms) [ path=#<RSolr::Client:0x000001081efe70>
parameters={data:
fq=type%3AGroup
&fq=is_site_b%3Atrue
&q=-1
&fl=%2A+score
&qf=name_text+display_ngram
&defType=dismax
&start=0
&rows=20,
method: post,
params: {:wt=>:ruby},
query: wt=ruby,
headers: {"Content-Type"=>"application/x-www-form-urlencoded; charset=UTF-8"}, path: select, uri: http://localhost:8984/solr/select?wt=ruby, open_timeout: , read_timeout: , retry_503: , retry_after_limit: } ]
Model Setup:
searchable do
  text :name
  text :item_display_part_numbers, :as => :display_ngram
end

def item_display_part_numbers
  self.items.map(&:display_part_number)
end
item_display_part_numbers is an array of part numbers. The patterns are either digits, digits with a -1, or the text "n/a".
Search:
@search = Sunspot.search(Group) do
  fulltext params[:search_string]
  paginate(:page => params[:page], :per_page => params[:per_page] || 20)
end
I believe this data object is indexed correctly. In the console, if I retrieve it and call its index method, I get the following:
<?xml version="1.0" encoding="UTF-8"?>
<add>
  <doc>
    <field name="id">Group 1365</field>
    <field name="type">Group</field>
    <field name="type">ActiveRecord::Base</field>
    <field name="class_name">Group</field>
    <field name="name_text">HEAVY DUTY BRASS BELL</field>
    <field name="display_ngram">n/a</field>
    <field name="display_ngram">455880-1</field>
    <field name="display_ngram">n/a</field>
  </doc>
</add>
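Your request uses the dismax query parser (defType=dismax in your log). When a query term starts with a hyphen, dismax interprets the text that follows as prohibited (http://wiki.apache.org/solr/DisMaxQParserPlugin#Query_Syntax), so q=-1 is read as "exclude documents containing 1" rather than as a search for the literal string "-1".
The dismax parser also supports phrase searching, so wrapping the query in double quotes makes the hyphen part of the phrase instead of an operator.
One solution is to adjust the Solr params before the request is sent: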
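@search = Sunspot.search(Group) do
  # The block argument is the Solr parameter hash, named solr_params here
  # so it does not shadow the controller's params.
  adjust_solr_params do |solr_params|
    q = solr_params[:q]
    # Quote the query only when it starts with a hyphen (and is present at all).
    solr_params[:q] = "\"#{q}\"" if q && q.start_with?("-")
  end
  fulltext params[:search_string]
  paginate(:page => params[:page], :per_page => params[:per_page] || 20)
end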
It looks like a dirty hack - but it works...
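If you would rather keep the quoting logic out of the search block, it can be pulled into a small plain-Ruby helper. This is only a sketch: quote_leading_hyphen is a hypothetical name, not part of Sunspot or RSolr, and stripping embedded double quotes is my own assumption about how to keep the phrase well-formed.

```ruby
# Hypothetical helper (not part of Sunspot): wraps a query in double quotes
# when it begins with a hyphen, so that dismax reads it as a phrase instead
# of treating the leading "-" as the prohibit operator.
def quote_leading_hyphen(query)
  text = query.to_s.strip
  # Queries that do not start with a hyphen are passed through unchanged.
  return text unless text.start_with?("-")

  # Assumption: drop embedded double quotes so the phrase stays well-formed.
  %("#{text.delete('"')}")
end
```

You could then pass the result straight to fulltext, for example fulltext quote_leading_hyphen(params[:search_string]), and leave adjust_solr_params out entirely. I have not tested this against your index, so treat it as a starting point.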