elasticsearch, elasticsearch-query, booleanquery

What does a "must" clause with an array of "match" clauses really mean?


I have an elasticsearch query which looks like this...

  "query": {
    "bool": {
      "must": [{
        "match": {"attrs.name": "username"}
      }, {
        "match": {"attrs.value": "johndoe"}
      }]
    }
  }

... and documents in the index that look like this:

{
  "key": "value",
  "attrs": [{
    "name": "username",
    "value": "jimihendrix"
  }, {
    "name": "age",
    "value": 23
  }, {
    "name": "alias",
    "value": "johndoe"
  }]
}

Which of the following does this query really mean?

  1. The document should contain either attrs.name = username OR attrs.value = johndoe.
  2. The document should contain both attrs.name = username AND attrs.value = johndoe, even if they match different elements in the attrs array (in which case the document above would match the query).
  3. The document should contain both attrs.name = username AND attrs.value = johndoe, and both must match the same element in the attrs array (in which case the document above would not match the query).

Further, how do I write a query to express #3 from the list above, i.e. the document should match only if a single element inside the attrs array matches both the following conditions:

  • attrs.name = username
  • attrs.value = johndoe

Solution

  • must behaves like a logical AND: a document is returned only if it satisfies every clause in the must array.

    So must does not give you option 1 (attrs.name = username OR attrs.value = johndoe). For that you need a should clause, which behaves like OR.

    Whether must behaves like option 2 or option 3 depends on the type of the attrs field.

    If attrs is mapped as an object type, the array is flattened: no relationship is kept between the fields of an individual array element. The must query therefore returns a document if any element has attrs.name = username and any element has attrs.value = johndoe, even if they are not part of the same object in the array. That is option 2, so the document above matches.
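The flattening described above can be illustrated with a small standalone Python sketch. It does not talk to Elasticsearch. It only models the idea that an object-typed array pools the values of each sub-field, and it approximates "match" by exact equality (an assumption; real matching runs an analyzer over text fields). The helper names flatten, must_on_flattened and must_per_element are hypothetical and exist only for this illustration.

```python
# Simplified model of how an "object" array is indexed: the values of each
# sub-field are pooled, so the pairing between name and value is lost.
# Assumption: "match" is approximated by exact equality.

doc = {
    "key": "value",
    "attrs": [
        {"name": "username", "value": "jimihendrix"},
        {"name": "age", "value": 23},
        {"name": "alias", "value": "johndoe"},
    ],
}


def flatten(document, field):
    """Pool each sub-field's values, as an object-typed array would be stored."""
    pooled = {}
    for element in document[field]:
        for sub_field, value in element.items():
            pooled.setdefault(f"{field}.{sub_field}", []).append(value)
    return pooled


def must_on_flattened(pooled, clauses):
    """Every clause must hit some pooled value, not necessarily the same element."""
    return all(wanted in pooled.get(path, []) for path, wanted in clauses.items())


def must_per_element(document, field, clauses):
    """Every clause must hit inside one single array element (option 3)."""
    return any(
        all(
            element.get(path.split(".", 1)[1]) == wanted
            for path, wanted in clauses.items()
        )
        for element in document[field]
    )


clauses = {"attrs.name": "username", "attrs.value": "johndoe"}
flattened = flatten(doc, "attrs")

print(flattened)
print("flattened (object) match:", must_on_flattened(flattened, clauses))
print("per-element match:", must_per_element(doc, "attrs", clauses))
```

On the sample document the flattened check succeeds, because "username" and "johndoe" each appear somewhere in the pooled values, while the per-element check fails, because no single element carries both.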

    If you want each object in the array to be indexed and queried like a separate document (option 3), map attrs as a nested field and use a nested query to match documents:

    {
      "query": {
        "nested": {
          "path": "attrs",
          "inner_hits": {},
          "query": {
            "bool": {
              "must": [
                {
                  "match": {
                    "attrs.name": "username"
                  }
                },
                {
                  "match": {
                    "attrs.value": "johndoe"
                  }
                }
              ]
            }
          }
        }
      }
    }
    

    The hits in the response contain the whole parent document, including all of its nested objects. To see which nested objects actually matched, specify inner_hits, as in the query above.
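A nested query only works if attrs is mapped with the nested type. As a rough sketch, the index body could look like the Python dict below. This assumes Elasticsearch 7 or later, where mappings are typeless, and it leaves the sub-fields name and value to dynamic mapping. The variable name index_body is hypothetical. The dict is only printed here, so how you send it to the cluster (REST call or client library) is up to you.

```python
import json

# Hypothetical index body: declares "attrs" as nested so that each array
# element is indexed as its own hidden document and keeps its name/value
# pairing. Sub-fields are left to dynamic mapping (an assumption).
index_body = {
    "mappings": {
        "properties": {
            "attrs": {"type": "nested"}
        }
    }
}

# Print the JSON that would be used as the body when creating the index.
print(json.dumps(index_body, indent=2))
```

The type of an existing field cannot be changed in place, so an index that already has attrs mapped as object would need to be recreated with this mapping and its documents reindexed.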