
Nested Query in Elastic Search


I have an Elasticsearch mapping of this form:

{
   "index1" : {
      "mappings" : {
         "properties" : {
            "key1" : {
               "type" : "keyword"
            },
            "key2" : {
               "type" : "keyword"
            },
            "key3" : {
               "properties" : {
                  "components" : {
                     "type" : "nested",
                     "properties" : {
                        "sub1" : {
                           "type" : "keyword"
                        },
                        "sub2" : {
                           "type" : "keyword"
                        },
                        "sub3" : {
                           "type" : "keyword"
                        }
                     }
                  }
               }
            }
         }
      }
   }
}

The data stored in Elasticsearch has this format:

{
   "_index" : "index1",
   "_type" : "_doc",
   "_id" : "1",
   "_score" : 1.0,
   "_source" : {
      "key1" : "val1",
      "key2" : "val2",
      "key3" : {
         "components" : [
            {
               "sub1" : "subval11",
               "sub3" : "subval13"
            },
            {
               "sub1" : "subval21",
               "sub2" : "subval22",
               "sub3" : "subval23"
            },
            {
               "sub1" : "subval31",
               "sub2" : "subval32",
               "sub3" : "subval33"
            }
         ]
      }
   }
}

As you can see, sub1, sub2 and sub3 may be missing from some of the objects under key3.components.

Now, if I try to fetch results where key3.sub2 is subval22 using this query:

GET index1/_search
{
  "query": {
    "nested": {
      "path": "components",
      "query": {
        "bool": {
          "must": [
            {
              "match": {"key3.sub2": "subval22"}
            }
          ]
        }
      }
    }
  }
}

I always get this error:

{
  "error": {
    "root_cause": [
      {
        "type": "query_shard_exception",
        "reason": "failed to create query: {...}",
        "index_uuid": "1",
        "index": "index1"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "index1",
        "node": "1aK..",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: {...}",
          "index_uuid": "1",
          "index": "index1",
          "caused_by": {
            "type": "illegal_state_exception",
            "reason": "[nested] failed to find nested object under path [components]"
          }
        }
      }
    ]
  },
  "status": 400
}

I understand that this error is thrown because sub2 is not present in all the objects under components. I am looking for a way to search in such cases so that the query checks every object in the array and returns the document if any of them matches.

Can someone help me get this working?


Solution

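  • The problem is in how the nested field is defined and addressed: the path in a nested query must be the full path of a field that is mapped as nested, and the queried field name must be the full dotted name as well. The schema below works fine. Note that I only defined key3 as nested and changed the nested path in the query to key3.

    Index definition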

    {
        "mappings": {
            "properties": {
                "key1": {
                    "type": "keyword"
                },
                "key2": {
                    "type": "keyword"
                },
                "key3": {
                    "type": "nested"
                }
            }
        }
    }
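
The error in the question names the path [components], while in the original mapping the nested field sits under key3, so its full path is key3.components. The following is a minimal Python sketch, not part of Elasticsearch, that walks the "properties" section of a mapping and lists every nested path. The helper nested_paths and the two dict names are hypothetical and exist only for this illustration.

```python
def nested_paths(properties, prefix=""):
    """Return the full dotted path of every field mapped as "nested"."""
    paths = []
    for name, spec in properties.items():
        full = prefix + name
        if spec.get("type") == "nested":
            paths.append(full)
        # Object and nested fields may declare sub-fields under "properties".
        paths.extend(nested_paths(spec.get("properties", {}), full + "."))
    return paths


# "properties" section of the mapping from the question.
original_mapping = {
    "key1": {"type": "keyword"},
    "key2": {"type": "keyword"},
    "key3": {
        "properties": {
            "components": {
                "type": "nested",
                "properties": {
                    "sub1": {"type": "keyword"},
                    "sub2": {"type": "keyword"},
                    "sub3": {"type": "keyword"},
                },
            }
        }
    },
}

# "properties" section of the mapping from this answer.
revised_mapping = {
    "key1": {"type": "keyword"},
    "key2": {"type": "keyword"},
    "key3": {"type": "nested"},
}

print(nested_paths(original_mapping))  # ['key3.components']
print(nested_paths(revised_mapping))   # ['key3']
```

Neither list contains a bare "components", which is consistent with the "failed to find nested object under path [components]" message.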
    

    Index your sample doc without any change

     {
      "key1": "val1",
      "key2": "val2",
      "key3": {
        "components": [ --> this was a diff
          {
            "sub1": "subval11",
            "sub3": "subval13"
          },
          {
            "sub1": "subval21",
            "sub2": "subval22",
            "sub3": "subval23"
          },
          {
            "sub1": "subval31",
            "sub2": "subval32",
            "sub3": "subval33"
          }
        ]
      }
    }
    

    Searching with your criteria

     {
        "query": {
            "nested": {
                "path": "key3", --> note this
                "query": {
                    "bool": {
                        "must": [
                            {
                                "match": {
                                    "key3.components.sub2": "subval22" --> note this
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
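
The same request body can be generated for either mapping, as long as the path is a nested field and the queried field is written with its full dotted name. Below is a small Python sketch that only builds the JSON body and does not talk to a cluster. The function nested_match_query is a hypothetical helper, and the claim that the original mapping should work with path key3.components is an assumption based on that mapping declaring key3.components as nested.

```python
import json


def nested_match_query(path, field, value):
    """Build a nested query body; `field` must be the full dotted name under `path`."""
    if not field.startswith(path + "."):
        raise ValueError(f"field {field!r} is not located under nested path {path!r}")
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"bool": {"must": [{"match": {field: value}}]}},
            }
        }
    }


# Body for the mapping in this answer (key3 is the nested field).
body_for_revised = nested_match_query("key3", "key3.components.sub2", "subval22")

# Body for the original mapping (key3.components is the nested field).
body_for_original = nested_match_query(
    "key3.components", "key3.components.sub2", "subval22"
)

print(json.dumps(body_for_original, indent=2))
```

Calling nested_match_query("components", "key3.sub2", "subval22"), which mirrors the query in the question, raises ValueError in this sketch because the field is not under the given path.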
    

    This returns the expected search result

       "hits": [
          {
            "_index": "so_nested_61200509",
            "_type": "_doc",
            "_id": "2",
            "_score": 0.2876821,
            "_source": {
              "key1": "val1",
              "key2": "val2",
              "key3": {
                "components": [ --> note this
                  {
                    "sub1": "subval11",
                    "sub3": "subval13"
                  },
                  {
                    "sub1": "subval21",
                    "sub2": "subval22",
                    "sub3": "subval23"
                  },
                  {
                    "sub1": "subval31",
                    "sub2": "subval32",
                    "sub3": "subval33"
                  }
                ]
              }
            }
          }
       ]
    

    Edit: Based on a comment from the OP, I updated the sample doc, search query and result.
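
On the concern raised in the question: objects that lack sub2 do not cause an error by themselves, they simply do not match. The Python sketch below is a rough stand-in for the matching rule (a document is returned if any object under the path has the requested value). It is not how Elasticsearch evaluates queries internally, and matches_nested is a hypothetical helper that assumes every path segment except the last resolves to a plain object.

```python
def matches_nested(doc, path, field, value):
    """Return True if any object found under `path` has `field` equal to `value`."""
    node = doc
    for part in path.split("."):
        # Missing or non-object segments resolve to an empty object.
        node = node.get(part, {}) if isinstance(node, dict) else {}
    objects = node if isinstance(node, list) else [node]
    # Objects without `field` are skipped; they never raise an error.
    return any(isinstance(obj, dict) and obj.get(field) == value for obj in objects)


sample_doc = {
    "key1": "val1",
    "key2": "val2",
    "key3": {
        "components": [
            {"sub1": "subval11", "sub3": "subval13"},  # no sub2 here
            {"sub1": "subval21", "sub2": "subval22", "sub3": "subval23"},
            {"sub1": "subval31", "sub2": "subval32", "sub3": "subval33"},
        ]
    },
}

print(matches_nested(sample_doc, "key3.components", "sub2", "subval22"))  # True
print(matches_nested(sample_doc, "key3.components", "sub2", "missing"))   # False
```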