java elasticsearch elasticsearch-java-api

How do I create an Elasticsearch query without knowing what the field is?


Someone else is indexing JSON objects into Elasticsearch, and I do not know any of their fields in advance. I would like to search all the fields for a given value using a matchQuery.

I understand that the _all field is deprecated, and copy_to doesn't seem to help because I don't know in advance which fields will exist. Is there a way to accomplish this without knowing which fields to search beforehand?


Solution

  • Yes, you can achieve this using a custom _all field (which I called my_all) and a dynamic template for your index. The idea is to have a generic mapping that applies to all fields, with a copy_to setting pointing to the my_all field. I've also added store: true to the my_all field, but only to show you that it works; in practice you won't need it.

    So let's go and create the index:

    PUT my_index
    {
      "mappings": {
        "_doc": {
          "dynamic_templates": [
            {
              "all_fields": {
                "match": "*",
                "mapping": {
                  "copy_to": "my_all"
                }
              }
            }
          ],
          "properties": {
            "my_all": {
              "type": "text",
              "store": true
            }
          }
        }
      }
    }
    

    Then index a document:

    PUT my_index/_doc/1
    {
      "test": "the cat drinks milk",
      "age": 10,
      "alive": true,
      "date": "2018-03-21T10:00:00.123Z",
      "val": ["data", "data2", "data3"]
    }
    

    Finally, we can search using the my_all field and also show its content (because we store its content) in addition to the _source of the document:

    GET my_index/_search?q=my_all:cat&_source=true&stored_fields=my_all
    

    And the result is shown below:

      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "test": "the cat drinks milk",
          "age": 10,
          "alive": true,
          "date": "2018-03-21T10:00:00.123Z",
          "val": [
            "data",
            "data2",
            "data3"
          ]
        },
        "fields": {
          "my_all": [
            "the cat drinks milk",
            "10",
            "true",
            "2018-03-21T10:00:00.123Z",
            "data",
            "data2",
            "data3"
          ]
        }
      }
    

    So, as long as you can create the index and its mapping yourself, you'll be able to search whatever people are sending to it.
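
    Since the question asks about a match query, the same search can presumably also be written in the request body instead of the query string. A minimal sketch, assuming the my_index index and the my_all field created above:

    GET my_index/_search
    {
      "query": {
        "match": {
          "my_all": "cat"
        }
      }
    }

    With the Java API this should correspond to something like QueryBuilders.matchQuery("my_all", "cat"), though the exact client setup depends on your Elasticsearch version.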