Elasticsearch: avoid maxClauseCount by refactoring bool match clauses


I have an Elasticsearch query that uses a lot of match clauses (around 1300) because I have a very large data set. Elasticsearch returns this error:

"error": {
    "root_cause": [
      {
        "type": "too_many_clauses",
        "reason": "too_many_clauses: maxClauseCount is set to 1024"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
}

I did some research online and found that increasing maxClauseCount is not considered good practice. Someone from Elastic mentioned that I should rewrite my query as a terms query rather than a bool query with many match clauses. Here is an example of my query. How do I rewrite it so that I don't hit maxClauseCount?

{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "city": "dallas"
          }
        },
        {
          "match": {
            "city": "london"
          }
        },
        {
          "match": {
            "city": "singapore"
          }
        },
        {
          "match": {
            "city": "prague"
          }
        },
        {
          "match": {
            "city": "ontario"
          }
        },
        ...........................................
        ...........................................
        ...........................................
      ]
    }
  }
}

Solution

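  • Replace the separate match clauses with one terms query inside must_not. The terms query takes the whole list of values as a single clause, so the bool query no longer grows with the number of cities. Note that terms, unlike match, does not analyze its values, so they must equal the indexed terms exactly:

    POST test/_search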
    {
      "query": {
        "bool": {
          "must_not": [
            {
              "terms": {
                "city": [
                  "prague",
                  "london",
                  "chicago",
                  "singapore",
                  "new york",
                  "san francisco",
                  "mexico city",
                  "baghdad"
                ]
              }
            }
          ]
        }
      }
    }
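
With around 1300 values it is easier to generate the terms query than to write it by hand. The following is a minimal Python sketch of that rewrite, not part of the original answer: the helper name merge_must_not_matches is hypothetical, it uses only the standard library, and it assumes every match value is a plain string that already equals the indexed term (for example, a lowercase single-word city on a text field with the standard analyzer). It only touches must_not, where "not A and not B" is equivalent to "not (A or B)".

```python
import json


def merge_must_not_matches(query: dict) -> dict:
    """Return a copy of `query` whose single-field match clauses under
    bool.must_not are merged into one terms clause per field.

    Hypothetical helper for illustration. Clauses that are not simple
    string-valued match clauses are kept unchanged.
    """
    bool_body = query["query"]["bool"]
    values_by_field = {}  # field name -> values, in first-seen order
    untouched = []        # clauses we do not know how to merge

    for clause in bool_body.get("must_not", []):
        match = clause.get("match")
        if isinstance(match, dict) and len(match) == 1:
            field, value = next(iter(match.items()))
            if isinstance(value, str):
                bucket = values_by_field.setdefault(field, [])
                if value not in bucket:  # drop duplicate values
                    bucket.append(value)
                continue
        untouched.append(clause)

    merged = [{"terms": {field: values}} for field, values in values_by_field.items()]
    new_bool = dict(bool_body)
    new_bool["must_not"] = merged + untouched
    return {**query, "query": {**query["query"], "bool": new_bool}}


# The five cities visible in the question; the real list would be longer.
original = {
    "query": {
        "bool": {
            "must_not": [
                {"match": {"city": city}}
                for city in ["dallas", "london", "singapore", "prague", "ontario"]
            ]
        }
    }
}

rewritten = merge_must_not_matches(original)
print(json.dumps(rewritten, indent=2))
```

The printed body can be sent to the _search endpoint as in the request above. If the values contain spaces or capital letters (such as "new york"), an analyzed text field will not contain them as single terms, so the query would need to target a keyword field or keep using match for those values.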