Elasticsearch: search multiple elements in an array of objects


I'm on Elasticsearch 6.8.22.

I have multiple users and each one has multiple papers ("valid" or not):

{"name":"Amy",
    "papers":[
        {"type":"idcard", "country":"fr", "valid":"no"},
        {"type":"idcard", "country":"us", "valid":"yes"}
    ]}

{"name":"Brittany",
    "papers":[
        {"type":"idcard", "country":"fr", "valid":"no"},
        {"type":"idcard", "country":"us", "valid":"no"}
    ]}

{"name":"Chloe",
    "papers":[
        {"type":"idcard", "country":"fr", "valid":"yes"},
        {"type":"idcard", "country":"us", "valid":"no"}
    ]}

I'm trying to find only the users who have a paper that is both "valid" and for "fr":

{"query": {
    "bool": {
      "filter": [
              {"match":{"papers.valid": "yes"}},
              {"match":{"papers.country": "fr"}}
      ]}}}

It returns Chloe, which is fine (she has a paper that is both "valid" and "fr"). But it also returns Amy, because she has one "valid" paper and another one that is "fr". As far as I understand, this is because ES has no notion of inner objects in an array: it flattens the array of objects into one multi-valued field per property (papers.valid, papers.country, ...), so the link between values of the same object is lost.
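
To make the flattening concrete, here is a minimal pure-Python sketch of the behaviour described above. It does not talk to Elasticsearch, and the helper names (flatten, flattened_match, per_object_match) are made up for illustration; it only models how values from different objects end up in the same multi-valued fields.

```python
# Pure-Python model of the flattening; no Elasticsearch client is involved.
docs = [
    {"name": "Amy", "papers": [
        {"type": "idcard", "country": "fr", "valid": "no"},
        {"type": "idcard", "country": "us", "valid": "yes"}]},
    {"name": "Brittany", "papers": [
        {"type": "idcard", "country": "fr", "valid": "no"},
        {"type": "idcard", "country": "us", "valid": "no"}]},
    {"name": "Chloe", "papers": [
        {"type": "idcard", "country": "fr", "valid": "yes"},
        {"type": "idcard", "country": "us", "valid": "no"}]},
]


def flatten(doc, field):
    """Collapse an array of objects into one list of values per sub-field."""
    flat = {}
    for obj in doc[field]:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def flattened_match(doc, criteria):
    """Each criterion is checked against the flattened lists independently."""
    flat = flatten(doc, "papers")
    return all(value in flat.get(key, []) for key, value in criteria.items())


def per_object_match(doc, criteria):
    """All criteria must hold for one and the same object in the array."""
    return any(
        all(obj.get(key.split(".", 1)[1]) == value for key, value in criteria.items())
        for obj in doc["papers"]
    )


criteria = {"papers.valid": "yes", "papers.country": "fr"}
flattened_hits = [d["name"] for d in docs if flattened_match(d, criteria)]
per_object_hits = [d["name"] for d in docs if per_object_match(d, criteria)]

print(flattened_hits)   # ['Amy', 'Chloe']
print(per_object_hits)  # ['Chloe']
```

For Amy, the flattened view is papers.valid = ["no", "yes"] and papers.country = ["fr", "us"], so both criteria are satisfied even though no single paper satisfies them together.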

I've tried using "combined term queries" from this link, but I guess that only works for arrays of primitives (not complex objects).

I've seen that I can map the array as nested objects to do what I need, but that seems overcomplicated and would slow down the queries (because of the hidden joins).

My question is: is there any way to search for documents whose array of objects contains at least one object that matches multiple criteria at the same time?

(Originally, I wanted a query that checks whether every entry in "papers" matches some criteria, e.g. all papers of type "idcard" must be "valid", but that seems impossible.)


Solution

  • You need to define papers as a nested field in the mapping; then you can run a nested query on it:

    https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html

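    For example, suppose your mapping is the following. (It is written in the typeless form; on 6.8 you will probably need to either wrap "properties" in a mapping type such as "_doc" or create the index with include_type_name=false.)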

    {
      "mappings": {
        "properties": {
          "name": {
            "type": "keyword"
          },
          "papers": {
            "type": "nested",
            "properties": {
              "type": {
                "type": "keyword"
              },
              "country": {
                "type": "keyword"
              },
              "valid": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
    

    Then this query will work:

    {
      "query": {
        "nested": {
          "path": "papers",
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "papers.valid": "yes"
                  }
                },
                {
                  "term": {
                    "papers.country": "fr"
                  }
                }
              ]
            }
          }
        }
      }
    }
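
The "every idcard must be valid" variant mentioned at the end of the question can usually be expressed on the same nested mapping by negating a nested query that matches a violating paper. The following is only a sketch under that assumption: the request body is built as a Python dict, the nested semantics are imitated locally instead of calling Elasticsearch, the names violating_paper, all_idcards_valid_query, terms_of and has_nested_match are hypothetical, and the user "Dana" is an invented document added so that the query has something to return.

```python
import json

# Assumption: "papers" is mapped as nested, as in the mapping above.
# This nested query matches a document that has at least one invalid idcard.
violating_paper = {
    "nested": {
        "path": "papers",
        "query": {"bool": {"filter": [
            {"term": {"papers.type": "idcard"}},
            {"term": {"papers.valid": "no"}},
        ]}},
    }
}

# Negating it keeps only documents in which no idcard is invalid.
all_idcards_valid_query = {"query": {"bool": {"must_not": [violating_paper]}}}

docs = [
    {"name": "Amy", "papers": [
        {"type": "idcard", "country": "fr", "valid": "no"},
        {"type": "idcard", "country": "us", "valid": "yes"}]},
    {"name": "Brittany", "papers": [
        {"type": "idcard", "country": "fr", "valid": "no"},
        {"type": "idcard", "country": "us", "valid": "no"}]},
    {"name": "Chloe", "papers": [
        {"type": "idcard", "country": "fr", "valid": "yes"},
        {"type": "idcard", "country": "us", "valid": "no"}]},
    # Hypothetical extra user (not in the question) whose idcards are all valid.
    {"name": "Dana", "papers": [
        {"type": "idcard", "country": "fr", "valid": "yes"},
        {"type": "idcard", "country": "us", "valid": "yes"}]},
]


def terms_of(nested_query):
    """Extract {sub_field: value} from the term filters of a nested query."""
    path = nested_query["nested"]["path"]
    terms = {}
    for clause in nested_query["nested"]["query"]["bool"]["filter"]:
        for field, value in clause["term"].items():
            terms[field[len(path) + 1:]] = value  # strip the "papers." prefix
    return terms


def has_nested_match(doc, terms):
    """Local stand-in for a nested query: one object must satisfy every term."""
    return any(all(p.get(f) == v for f, v in terms.items()) for p in doc["papers"])


hits = [d["name"] for d in docs
        if not has_nested_match(d, terms_of(violating_paper))]

print(json.dumps(all_idcards_valid_query, indent=2))
print(hits)  # ['Dana']
```

Note that a must_not of this kind also matches users who have no papers at all; if those should be excluded, an additional nested clause requiring at least one idcard would be needed.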