Apache Solr: search automatically with a wildcard (*)


Good evening,

When I search for the word "app", the results do not include the word "apple". But if I search for "app*", both "apple" and "app" are returned. I don't want to type "*" in the search bar. How can I set this up so that a search for "app" returns both "apple" and "app"?

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

I tried adding <filter class="solr.ReversedWildcardFilterFactory"/>, but it didn't work.

Can someone help me?

I use Apache Solr 6.4.1.

Sorry for my bad English.


Solution

  • Use EdgeNGramFilterFactory

    EdgeNGramFilterFactory:

    This filter generates edge n-gram tokens of sizes within the given range.

    Arguments:

    • minGramSize: (integer, default 1) The minimum gram size.
    • maxGramSize: (integer, default 1) The maximum gram size.

    Example:

    With minGramSize="1" and maxGramSize="4":

    In: "four score"
    Tokenizer to Filter: "four", "score"
    Out: "f", "fo", "fou", "four", "s", "sc", "sco", "scor"

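    For your case, you can use the schema below. The edge n-gram filter is applied only in the index analyzer, so "apple" is indexed as "a", "ap", "app", "appl", "apple", and a plain query for "app" matches the indexed gram "app":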

    <fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
          <tokenizer class="solr.StandardTokenizerFactory"/>
          <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
          <filter class="solr.LowerCaseFilterFactory" />
          <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="200"/>
        </analyzer>
        <analyzer type="query">
          <tokenizer class="solr.StandardTokenizerFactory"/>
          <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
          <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
          <filter class="solr.LowerCaseFilterFactory" />
        </analyzer>
    </fieldType>
    

    Then change the type of your field to text_ngram, for example:

    <field name="name" type="text_ngram" indexed="true" stored="false" multiValued="true"/>
    

    Note: Don't forget to reload the core and reindex your data. Documents that are already indexed are not re-analyzed until they are indexed again.
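
To see why index-time edge n-grams let a plain query for "app" find "apple", here is a minimal Python sketch of the idea. It is not Solr code: the function names (edge_ngrams, index_terms, matches) are made up for illustration, tokenizing is reduced to lowercasing and splitting on whitespace, and the rule that every query word must match is a simplification of how Solr actually scores results.

```python
def edge_ngrams(token, min_gram=1, max_gram=4):
    """Return the leading-edge n-grams of one token, shortest first."""
    # A gram can never be longer than the token itself.
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text, min_gram=1, max_gram=4):
    """Simplified index analyzer: lowercase, split on whitespace, expand to edge n-grams."""
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token, min_gram, max_gram))
    return terms


def matches(query, document, min_gram=1, max_gram=200):
    """Simplified query side: query words are NOT expanded, only looked up as-is."""
    indexed = set(index_terms(document, min_gram, max_gram))
    return all(word in indexed for word in query.lower().split())


if __name__ == "__main__":
    print(index_terms("four score"))   # same input as the example above
    print(matches("app", "apple"))     # prefix of an indexed word
    print(matches("ppl", "apple"))     # not a leading prefix
```

Because only the indexed side is expanded, prefixes match but substrings from the middle of a word do not, and a longer query such as "apple" does not match a document that only contains "app".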