elasticsearch elasticsearch-6

Elasticsearch word order


I have indexed the following documents using the standard analyzer:

foo 1 bar
foo 2 bar
foo 3 bar

and so on.

When I run a match query like "asdf foo 1 bar 2", foo 2 bar gets a higher score than foo 1 bar, even though the query string contains the phrase "foo 1 bar".

How can I construct my query so that it takes word order into account? The main problem is that the query string may contain more words than the documents do.


Solution

  • You should look into using "shingles". They're like mini-phrases that help improve relevance by grouping adjacent terms into pairs. A document that matches several shingles then scores higher than a document that only matches individual words.

    Original for doc 1

    "foo 1 bar"
    

    Shingles for doc 1

    "foo 1", "1 bar"
    

    So for the query asdf foo 1 bar 2, you'll get matches on the shingles foo 1 and 1 bar for those parts of the query, which will increase the relevance of the first document over the second.

    Learn more about shingles in the Elasticsearch Docs.

    You should probably map this field as a multi-field so you get the benefits of shingles as well as the standard text analysis. That process is covered in the Elasticsearch docs, and you can create another question here if you get stuck.
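
To make the suggestion above more concrete, here is a minimal sketch in Python that only builds the request bodies as plain dicts, so it runs without a cluster. Several things are assumptions rather than part of the question: the field name `title`, the filter name `pair_shingles`, the analyzer name `shingle_analyzer`, and the `_doc` mapping type (Elasticsearch 6.x still expects a type name in the mapping). The small `shingles` helper is only a rough stand-in for what the standard tokenizer plus a two-word shingle filter would produce; it is not how Elasticsearch implements it.

```python
import json


def shingles(text, size=2):
    """Rough stand-in for the standard tokenizer plus a two-word shingle filter."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


docs = ["foo 1 bar", "foo 2 bar", "foo 3 bar"]
query_text = "asdf foo 1 bar 2"

# Which two-word shingles does each document share with the query?
query_shingles = set(shingles(query_text))
shared = {doc: sorted(query_shingles & set(shingles(doc))) for doc in docs}

# Index settings and mapping (all names are hypothetical).
# "title" keeps the standard analysis; "title.shingles" indexes word pairs only.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "pair_shingles": {
                    "type": "shingle",
                    "min_shingle_size": 2,
                    "max_shingle_size": 2,
                    "output_unigrams": False,
                }
            },
            "analyzer": {
                "shingle_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "pair_shingles"],
                }
            },
        }
    },
    "mappings": {
        "_doc": {  # assumed mapping type name for Elasticsearch 6.x
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "standard",
                    "fields": {
                        "shingles": {"type": "text", "analyzer": "shingle_analyzer"}
                    },
                }
            }
        }
    },
}

# The "must" clause keeps ordinary word matching; the "should" clause adds
# score for every adjacent word pair that also appears in the document.
search_body = {
    "query": {
        "bool": {
            "must": {"match": {"title": query_text}},
            "should": {"match": {"title.shingles": query_text}},
        }
    }
}

print(shared)
print(json.dumps(index_body, indent=2))
print(json.dumps(search_body, indent=2))
```

In this sketch only "foo 1 bar" shares shingles with the query ("foo 1" and "1 bar"), so it is the only document that would pick up extra score from the `should` clause. The first body would be sent when creating the index and the second to the `_search` endpoint; how much the shingle clause should weigh relative to the plain match is a tuning decision left open here.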