Tags: elasticsearch, full-text-search, elastic-stack, amazon-opensearch

Elasticsearch Querying Nested Structure


I am trying to query documents that contain nested objects:

{
  "name":"Adam",
  "cities":[
     {
        "name":"California",
        "activities":[
           {
              "name":"Hicking"
           },
           {
              "name":"Camping"
           },
           {
              "name":"Festival"
           }
        ]
     },
     {
        "name":"Miami",
        "activities":[
           {
              "name":"Festival"
           },
           {
              "name":"Diving"
           },
           {
              "name":"Surfing"
           }
        ]
     }
  ]
}

I want to retrieve the cities that match a given activity. For example, searching for the cities where Adam wants to go to a festival should return California and Miami, while searching for the city where Adam will go "Hicking" should return only California.

That is the expected behavior. In practice, when I search for the cities where Adam will go "Hicking", the result contains both California and Miami.

This is the query I used:

GET index/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "cities.activities.name.keyword": "Hicking"
          }
        }
      ]
    }
  },
  "_source": ["cities.name"]
}

And this is the search result:

    "_index" : "index",
    "_type" : "prod",
    "_id" : "1",
    "_score" : 0.0,
    "_source" : {
      "cities" : [
        {
          "name" : "California"
        },
        {
          "name" : "Miami"
        }
      ]
    }

How can I write the query correctly for this data structure?

Would a nested field help?

Should I redesign the data structure?


Solution

  • Map the cities array as nested and use a nested query to filter on cities.activities.name. Elasticsearch returns the whole matching document by default, so _source still lists every city, including the ones that do not match the filter. To get only the sub-documents that matched, add inner_hits to the nested query.

    Here is an example query; the matching cities appear under inner_hits in the response:

    {
      "_source": ["cities.name"], 
      "query": {
        "nested": {
          "path": "cities",
          "query": {
            "term": {
              "cities.activities.name.keyword": {
                "value": "Hicking"
              }
            }
          },
          "inner_hits": {
            "_source": ["cities.name"]
          }
        }
      }
    }
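
    The query above only works if cities is mapped as nested, so a mapping along the following lines is needed first. This is a minimal sketch, not the asker's actual mapping: it assumes Elasticsearch 7 or later (typeless mappings), reuses the index name "index" from the question, and assumes each name field is text with a keyword sub-field, which is what the .keyword suffix in the query implies. On 6.x, where the result above shows "_type": "prod", the properties block would sit under the type name instead.

    ```
    # Assumption: Elasticsearch 7+ typeless mapping; field types are illustrative
    PUT index
    {
      "mappings": {
        "properties": {
          "name": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword" } }
          },
          "cities": {
            "type": "nested",
            "properties": {
              "name": {
                "type": "text",
                "fields": { "keyword": { "type": "keyword" } }
              },
              "activities": {
                "properties": {
                  "name": {
                    "type": "text",
                    "fields": { "keyword": { "type": "keyword" } }
                  }
                }
              }
            }
          }
        }
      }
    }
    ```

    An existing object field cannot be switched to nested in place, so the mapping has to be applied to a new index and the documents reindexed into it. Only cities is marked nested here because the query needs to tell cities apart; activities can remain a plain object array inside each city.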