Range query does not work on TextFields in Solr


I defined a dynamic TextField in Solr:

<dynamicField name="*" type="text" indexed="true" stored="true" omitTermFreqAndPositions="true" omitNorms="true" docValues="false" multiValued="false"/>
<fieldType name="text" class="solr.TextField"/>

If I run a range query on such fields, every document in the index is returned, even those outside the range. The debugQuery output shows that the range query is internally rewritten to "{* TO *}":

  "debug":{
    "rawquerystring":"Format.Typ:[X TO Y]",
    "querystring":"Format.Typ:[X TO Y]",
    "parsedquery":"SolrRangeQuery(Format.Typ:{* TO *})",
    "parsedquery_toString":"Format.Typ:{* TO *}",
    "explain":{
      "$UTILMD&5.2c&3ced6800-1262-11ed-a3af-00155d639d26":"\n1.0 = Format.Typ:{* TO *}\n"},
    ...

The Solr documentation states that range searches should work with all types of fields: https://solr.apache.org/guide/8_11/the-standard-query-parser.html#range-searches

And it works fine if I define the fields as StrField. Why is the query changing and what can I do to make it work with TextFields?


Solution

  • You need to define an analyzer in the fieldType definition.

    An analyzer examines the text of fields and generates a token stream.

    Note that the analysis chain operates both on text being indexed and text being queried. You can define one analyzer for both, or one specific for each.

    Although a default analyzer exists as a fallback, it does not let you do much. More precisely, a text field needs a tokenizer to work as "expected"; otherwise there is no point in using a TextField at all.

    Even the Keyword tokenizer, which just treats the entire stream as a single token, should enable range queries to work properly:

    <fieldType name="text" class="solr.TextField">
      <analyzer class="org.apache.lucene.analysis.core.KeywordAnalyzer"/>
    </fieldType>
    

    Note: the above is equivalent to:

    <fieldType name="text" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
      </analyzer>
    </fieldType>
    

    Once the changes are applied, you will have to delete the content of all "text" fields and reload the core before you can index and query these fields again.
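
As mentioned above, the analysis chain can be declared once for both indexing and querying, or separately for each. The following is only a sketch of the separate form, using the `type="index"` and `type="query"` attributes of `<analyzer>`. Both chains use the same Keyword tokenizer here, so the behaviour should match the single-analyzer definitions above; whether you need them to differ at all is an assumption that depends on your data.

```xml
<fieldType name="text" class="solr.TextField">
  <!-- Applied to field values when documents are indexed -->
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
  <!-- Applied to query text at search time -->
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

The two chains would only diverge if you added filters to one side and not the other. For a field used in range queries, keeping them identical is probably the safest starting point.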