
How to run a Solr query with an incomplete last term?


I am trying to write a query such that, for any of the following inputs:

  1. Elephant
  2. Elephant is the bigge
  3. Elephant is the biggest land mammal.

Solr should return the following result: Elephant is the biggest land mammal.

I've tried the following two methods:

  1. field:"query*"
  2. field:"Elephant is the" && field:"bigge*"

The first query can't handle an incomplete last word, as in the query Elephant is the bigge. The second query does handle it, but it isn't always reliable. Is there a better approach?

PS: I am using text_general for that field, if that helps.


Solution

  • If this is a common query against that field and your query is always a prefix of the field's content, consider using a string field instead, or a TextField with a KeywordTokenizer and a LowerCaseFilter if you want it to be case-insensitive. It'll be more efficient than trying to make text_general work as you want.

    If you stay with text_general, there might be a few issues with stopwords, but I think the ComplexPhraseQueryParser should do what you want:

    q={!complexphrase inOrder=true}field:"Elephant is the bigge*"
    

    Make sure to append * to the last term in your query so that it is treated as a prefix wildcard.

    You should also be aware that wildcard terms are handled specially in Lucene/Solr: they can't use all the regular filters in the analysis chain defined for a field, so you might get surprising results when that field uses stemming, synonyms, etc. In that case it's better to use a dedicated field type that you can run a prefix search against, as mentioned at the start.
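
As a rough sketch of building both query forms on the client side, something like the following Python could work. The helper names and the `field_exact` field are made up for illustration, and splitting on `\w+` is only an approximation of how text_general tokenizes, so treat this as a starting point rather than a definitive implementation.

```python
import re


def complexphrase_prefix_query(field: str, user_input: str) -> str:
    """Return a complexphrase query whose last term is a prefix wildcard."""
    # Assumption: the field uses a text_general-like analyzer, which drops
    # punctuation and lowercases terms at index time. Extracting word
    # characters mimics that roughly and also removes query-syntax characters.
    terms = re.findall(r"\w+", user_input.lower())
    if not terms:
        raise ValueError("query contains no searchable terms")
    terms[-1] += "*"  # only the last, possibly incomplete, term is a wildcard
    return '{!complexphrase inOrder=true}%s:"%s"' % (field, " ".join(terms))


def string_field_prefix_query(field: str, user_input: str, lowercase: bool = True) -> str:
    """Return a prefix query for a string or KeywordTokenizer-based field."""
    # The prefix parser does not analyze its input, so lowercase here only if
    # the (hypothetical) target field lowercases its content at index time.
    value = user_input.lstrip()
    if lowercase:
        value = value.lower()
    return "{!prefix f=%s}%s" % (field, value)


if __name__ == "__main__":
    for text in ("Elephant", "Elephant is the bigge", "Elephant is the biggest land mammal."):
        print(complexphrase_prefix_query("field", text))
        print(string_field_prefix_query("field_exact", text))
```

Either string would be sent as the `q` parameter (URL-encoded by your HTTP client). The first form keeps text_general and relies on the ComplexPhraseQueryParser; the second assumes you have added a separate string-like field as suggested at the start.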