There are two fields in my schema:
field1 is using the keyword tokenizer, which preserves the input as a single token (it does not even split on spaces; I double-checked that in the analysis tab).
field2 is using WhitespaceTokenizerFactory, which splits the input on spaces, tabs, etc.
<field name="field1" type="field1_type" indexed="true" stored="false"/>
<field name="field2" type="field2_type" indexed="true" stored="false"/>
<fieldType name="field2_type" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>
I am using the edismax parser with the default qf value = field1
Now, when I query with q=hello world, debug mode shows that the query is parsed like this:
rawquerystring:hello world
querystring:hello world
parsedquery:(+((DisjunctionMaxQuery((field1:hello | field2:hello)) DisjunctionMaxQuery((field1:world | field2:world)))~1) ())/no_coord
parsedquery_toString:+(((field1:hello | field2:hello) (field1:world | field2:world))~1) ()
What I expected was something like this:
expected:+(((field1:hello world) ((field2:hello) (field2:world))~1) ()
i.e. for field1 it should not split the query on the space, because that field uses the keyword tokenizer, while for field2 it should split the query on the space.
Can you please tell me what I am doing wrong?
You need to escape the space in your query (with a backslash, or by putting quotes around the term). The query parser splits the query string on whitespace before any field's analyzer sees it, so it doesn't parse based on the analyzer/tokenizer for each field.
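
To make the two options concrete, here is a minimal client-side sketch of building the escaped query. The helper names (`escape_spaces`, `quote_phrase`, `build_params`) are made up for illustration, and the `qf` value of `field1 field2` is an assumption based on the debug output in the question; adjust both to your setup.

```python
from urllib.parse import urlencode


def escape_spaces(term: str) -> str:
    """Backslash-escape each space so the parser keeps the term in one piece."""
    return term.replace(" ", "\\ ")


def quote_phrase(term: str) -> str:
    """Wrap the term in double quotes, escaping embedded backslashes and quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_params(q: str) -> str:
    """URL-encode the request parameters (the qf value is an assumed example)."""
    return urlencode({"defType": "edismax", "qf": "field1 field2", "q": q})


escaped_q = escape_spaces("hello world")  # backslash variant
quoted_q = quote_phrase("hello world")    # quoted variant

print(build_params(escaped_q))
print(build_params(quoted_q))
```

Either form reaches the parser as a single unit instead of two whitespace-separated clauses. Check the parsedquery in debug output afterwards, since the two variants may be analyzed differently for field2.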