Solr not returning the exact element


Using Solr 7.7.3, I have an element with the label "alpha-ravi". When I search in Solr for label:"alpha", it returns the element with the label "alpha-ravi". From my reading of the Solr documentation, it should not return this element. Can anyone explain this behavior?


Solution

  • If you want exact matches only (i.e. return docs with "alpha-ravi" only when the user searches for exactly "alpha-ravi"), I would suggest the Keyword tokenizer (solr.KeywordTokenizerFactory). This tokenizer treats the entire "alpha-ravi" as a single token, so a search for just "alpha" or "ravi" will not match it.

    For example, in your schema.xml file you could add something like this (configure the filter chain to suit your needs):

    <fieldType name="single_token_string" class="solr.TextField" sortMissingLast="true">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.TrimFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.TrimFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>
    
    

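    You can then use this fieldType for a field in the same schema.xml (referencing the single_token_string type we just defined). Reindex your documents after changing the schema so that the new analysis applies to existing data.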

    <field name="myField" type="single_token_string" indexed="true" stored="true" />
    

    Solr's default text field types (e.g. text_general) use the StandardTokenizer, which splits "alpha-ravi" at the hyphen into the two tokens "alpha" and "ravi". That is why a search for "alpha" (or "ravi") matches the element.
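    To see why the tokenizer choice changes the result, here is a rough Python sketch of the matching logic. It is not Solr's code: `standard_like_tokens`, `keyword_tokens` and `matches` are hypothetical helpers that only approximate the two analyzers (the real StandardTokenizer follows Unicode word-break rules, and real queries also consider term positions).

```python
import re


def standard_like_tokens(text):
    """Approximate StandardTokenizer + lowercasing: split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9A-Za-z]+", text.lower()) if tok]


def keyword_tokens(text):
    """Approximate KeywordTokenizer + trim + lowercase: the whole value is one token."""
    return [text.strip().lower()]


def matches(query, indexed_value, tokenize):
    """Simplified match: every query token must be among the indexed tokens."""
    indexed = set(tokenize(indexed_value))
    return all(tok in indexed for tok in tokenize(query))


print(standard_like_tokens("alpha-ravi"))                    # ['alpha', 'ravi']
print(matches("alpha", "alpha-ravi", standard_like_tokens))  # True
print(matches("alpha", "alpha-ravi", keyword_tokens))        # False
print(matches("alpha-ravi", "alpha-ravi", keyword_tokens))   # True
```

    With the hyphen-splitting analyzer the indexed value contains the token "alpha", so the query hits. With the single-token analyzer only the full value "alpha-ravi" matches.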

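    As an alternative, you could run a phrase query. Note that a phrase is still analyzed by the field's tokenizer, so with the StandardTokenizer label:"alpha-ravi" only requires the tokens "alpha" and "ravi" to appear next to each other in that order. It narrows the results, but it does not stop a search for "alpha" alone from matching. Possibly something like http://localhost:8983/solr/...fq=label:"alpha-ravi"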
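    When sending a quoted filter query like this over HTTP, the quotes and colon need URL encoding. A minimal Python sketch, assuming a hypothetical core named `mycore` and the standard `/select` handler (substitute your own core or collection name):

```python
from urllib.parse import urlencode

# "mycore" is a placeholder core name, not taken from the original question.
base_url = "http://localhost:8983/solr/mycore/select"

params = {
    "q": "*:*",                   # match everything, then narrow with fq
    "fq": 'label:"alpha-ravi"',   # quoted phrase filter on the label field
}

# urlencode percent-encodes the quotes, colon and asterisks for us.
query_string = urlencode(params)
url = f"{base_url}?{query_string}"
print(url)
```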

    Hope that helps. All the best!