elasticsearch elasticsearch-query elasticsearch-nested

Query an Array of Objects to be an Exact Match in ElasticSearch


Consider the following index:

PUT /orchards
{
  "mappings": {
    "properties": {
      "fruits": {
        "type": "nested",
        "properties": {
          "farms": {
            "type": "long"
          },
          "visits": {
            "type": "long"
          },
          "counts": {
            "type": "long"
          }
        }
      }
    }
  }
}

with these documents:

POST orchards/_doc
{
  "fruits" : [
    {
    "farms" : 1,
    "visits" : 1,
    "counts" : [1]
    }
  ]
}
POST orchards/_doc
{
  "fruits" : [
    {
    "farms" : 1,
    "visits" : 1,
    "counts" : [1]
    },
    {
    "farms" : 2,
    "visits" : 2,
    "counts" : [1,2]
    }
  ]
}

Now I want to write a query that returns only the document whose fruits array is exactly

"fruits" : [
    {
    "farms" : 1,
    "visits" : 1,
    "counts" : [1]
    }
  ]

I tried the following query:

GET /orchards/_search
{
  "query": {
    "nested": {
      "path": "fruits",
      "query": {
        "bool": {
          "must": [
            {
              "terms": {
                "fruits.counts": [
                  1
                ]
              }
            },
            {
              "match": {
                "fruits.visits": 1
              }
            },
            {
              "match": {
                "fruits.farms": 1
              }
            }
          ]
        }
      }
    }
  }
}

However, the hits contained both documents, i.e., the query does not perform an exact match.

I want to avoid comparing lengths using scripts because that is an expensive operation, as explained in this post.

I tried some other variations as well, but they all returned either no hits or both documents.

References

Query on Nested Type

Query on Array Type


Solution

  • TL;DR

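    This is the expected behaviour. A nested query matches a document as soon as at least one of its nested objects satisfies the query, and the whole parent document is then returned. The second document contains a fruits object that matches, so it is a hit as well.

    If you are only interested in the nested objects that matched, you can use inner_hits.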

    Solution

    Notice the "_source": false and the "inner_hits": {} in the query:

    GET /orchards/_search
    {
      "_source": false,
      "query": {
        "nested": {
          "path": "fruits",
          "inner_hits": {},
          "query": {
            "bool": {
              "must": [
                {
                  "terms": {
                    "fruits.counts": [
                      1
                    ]
                  }
                },
                {
                  "match": {
                    "fruits.visits": 1
                  }
                },
                {
                  "match": {
                    "fruits.farms": 1
                  }
                }
              ]
            }
          }
        }
      }
    }
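
To illustrate why both documents are hits, here is a minimal Python sketch of the matching rule described above. It is a toy model, not Elasticsearch code: the helper names fruit_matches and nested_search are hypothetical, and the two documents are the ones from the question. Each nested object is tested on its own, a parent document matches as soon as one object passes, and the list of passing objects plays roughly the role of inner_hits.

```python
# Toy model of nested-query semantics in plain Python (not the Elasticsearch API).
# The documents are copied from the question; the helper names are hypothetical.
docs = [
    {"fruits": [{"farms": 1, "visits": 1, "counts": [1]}]},
    {"fruits": [
        {"farms": 1, "visits": 1, "counts": [1]},
        {"farms": 2, "visits": 2, "counts": [1, 2]},
    ]},
]


def fruit_matches(fruit):
    """Mirror the bool/must clauses of the query for a single nested object."""
    return (
        1 in fruit["counts"]        # terms: the array contains the value 1
        and fruit["visits"] == 1    # match on fruits.visits
        and fruit["farms"] == 1     # match on fruits.farms
    )


def nested_search(documents):
    """Return (document index, matching nested objects) for each matching document."""
    hits = []
    for index, doc in enumerate(documents):
        inner = [fruit for fruit in doc["fruits"] if fruit_matches(fruit)]
        if inner:  # one matching nested object is enough for the parent to match
            hits.append((index, inner))
    return hits


results = nested_search(docs)
for doc_index, inner_hits in results:
    print(doc_index, inner_hits)
```

In this model both documents come back, but for the second one only its first fruits object is listed, which corresponds to what inner_hits exposes in the real response.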