java elasticsearch match spring-data-elasticsearch

Elasticsearch: search all documents, in order of relevance score


I have a complex match query against an index of news-headline documents:

{
  "bool" : {
    "should" : [
      {
        "multi_match" : {
          "query" : " Reliance gets shareholders, creditors nod for hiving off O2C business into separate unit - The HinduBillionaire Mukesh Ambani's Reliance Industries Ltd on Friday said it has secured approval of its shareholders and creditors for hiving off its oil-to-chemical (O2C) business into a separate unit.",
          "fields" : [
            "article.description^1.0",
            "article.title^1.0"
          ],
          "type" : "best_fields",
          "operator" : "OR",
          "slop" : 0,
          "prefix_length" : 0,
          "max_expansions" : 50,
          "zero_terms_query" : "NONE",
          "auto_generate_synonyms_phrase_query" : true,
          "fuzzy_transpositions" : true,
          "boost" : 3.0
        }
      },
      {
        "match" : {
          "article.author" : {
            "query" : " PTI",
            "operator" : "OR",
            "prefix_length" : 0,
            "max_expansions" : 50,
            "fuzzy_transpositions" : true,
            "lenient" : false,
            "zero_terms_query" : "NONE",
            "auto_generate_synonyms_phrase_query" : true,
            "boost" : 1.0
          }
        }
      },
      {
        "match" : {
          "article.source.name" : {
            "query" : " The Hindu",
            "operator" : "OR",
            "prefix_length" : 0,
            "max_expansions" : 50,
            "fuzzy_transpositions" : true,
            "lenient" : false,
            "zero_terms_query" : "NONE",
            "auto_generate_synonyms_phrase_query" : true,
            "boost" : 1.0
          }
        }
      }
    ],
    "adjust_pure_negative" : true,
    "boost" : 1.0
  }
}

The problem is that Elasticsearch returns only the relevant documents. I want as many documents as possible, in decreasing order of relevance score. It's fine to return all the documents, as long as they are ordered that way. I could not find a way to do this: Elasticsearch returns only 5-10 documents from the news repository, while I have thousands of articles.


Solution

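  • By default, Elasticsearch returns only the top 10 hits. To return more than 10 documents, set the size parameter at the top level of the search request. Note that from + size is capped at 10,000 by default (the index.max_result_window setting).

    The modified query is: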

    {
      "size": 1000,         // note this
      "query": {
        "bool": {
          "should": [
            {
              "multi_match": {
                "query": " Reliance gets shareholders, creditors nod for hiving off O2C business into separate unit - The HinduBillionaire Mukesh Ambani's Reliance Industries Ltd on Friday said it has secured approval of its shareholders and creditors for hiving off its oil-to-chemical (O2C) business into a separate unit.",
                "fields": [
                  "article.description^1.0",
                  "article.title^1.0"
                ],
                "type": "best_fields",
                "operator": "OR",
                "slop": 0,
                "prefix_length": 0,
                "max_expansions": 50,
                "zero_terms_query": "NONE",
                "auto_generate_synonyms_phrase_query": true,
                "fuzzy_transpositions": true,
                "boost": 3.0
              }
            },
            {
              "match": {
                "article.author": {
                  "query": " PTI",
                  "operator": "OR",
                  "prefix_length": 0,
                  "max_expansions": 50,
                  "fuzzy_transpositions": true,
                  "lenient": false,
                  "zero_terms_query": "NONE",
                  "auto_generate_synonyms_phrase_query": true,
                  "boost": 1.0
                }
              }
            },
            {
              "match": {
                "article.source.name": {
                  "query": " The Hindu",
                  "operator": "OR",
                  "prefix_length": 0,
                  "max_expansions": 50,
                  "fuzzy_transpositions": true,
                  "lenient": false,
                  "zero_terms_query": "NONE",
                  "auto_generate_synonyms_phrase_query": true,
                  "boost": 1.0
                }
              }
            }
          ],
          "adjust_pure_negative": true,
          "boost": 1.0
        }
      }
    }
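
Since the question mentions Java, here is a minimal sketch of assembling such a request body in code. It makes two assumptions that go beyond the answer above: it adds a match_all clause under must, so that documents matching none of the should clauses are still returned (a bool query with only should clauses requires at least one of them to match), and it checks from + size against the default index.max_result_window of 10,000. The class name RankedSearchBody, the method build, and the shortened sample clause are hypothetical and used only for illustration.

```java
class RankedSearchBody {

    // Default value of index.max_result_window; from + size may not exceed it.
    static final int MAX_RESULT_WINDOW = 10_000;

    /**
     * Builds a search body that matches every document (match_all under must)
     * and ranks hits by how well they match the optional should clauses.
     *
     * @param shouldClausesJson comma-separated JSON objects for the should array
     * @param from              offset of the first hit to return
     * @param size              number of hits to return
     */
    static String build(String shouldClausesJson, int from, int size) {
        if (from < 0 || size < 0 || (long) from + size > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException(
                "from + size must be between 0 and " + MAX_RESULT_WINDOW);
        }
        return "{"
            + "\"from\":" + from + ","
            + "\"size\":" + size + ","
            + "\"query\":{\"bool\":{"
            // match_all makes every document a hit; should clauses only add score.
            + "\"must\":[{\"match_all\":{}}],"
            + "\"should\":[" + shouldClausesJson + "]"
            + "}}}";
    }

    public static void main(String[] args) {
        // Shortened, illustrative clause; the real query would pass all three clauses.
        String clause = "{\"match\":{\"article.author\":\"PTI\"}}";
        System.out.println(build(clause, 0, 1000));
    }
}
```

The resulting string can be sent as the body of a POST to the index's _search endpoint. Hits still come back sorted by descending _score, so documents matching the should clauses rank first and the rest follow with the baseline score contributed by match_all.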