python api elasticsearch postman kibana

How to retrieve elasticsearch data from index based on timestamp?


I want to retrieve data from Elasticsearch based on a timestamp. The timestamp is stored in epoch_millis, and I tried to retrieve the data like this:

{
  "query": {
    "bool": {
      "must":[ 
              {
                "range": {
                  "TimeStamp": {
                    "gte": "1632844180",
                    "lte": "1635436180"
                  }
                }
              }
      ]
    }
  },
  "size": 10
}

But the response is this:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

How can I retrieve data for a given period of time from a certain index?

The data looks like this:


    {
        "_index" : "my-index",
        "_type" : "_doc",
        "_id" : "zWpMNXcBTeKmGB84eksSD",
        "_score" : 1.0,
        "_source" : {
          "Source" : "Market",
          "Category" : "electronics",
          "Value" : 20,
          "Price" : 45.6468,
          "Currency" : "EUR",
          "TimeStamp" : 1611506922000
        }
    }

Also, a _search on the index reports 10,000 hits. How can I access the remaining entries (more than 10,000 results) while still choosing the desired timestamp interval?


Solution

  • For your first question, assume the index has mappings like this:

    {
        "mappings": {
            "properties": {
                "Source": {
                    "type": "keyword"
                },
                "Category": {
                    "type": "keyword"
                },
                "Value": {
                    "type": "integer"
                },
                "Price": {
                    "type": "float"
                },
                "Currency": {
                    "type": "keyword"
                },
                "TimeStamp": {
                    "type": "date"
                }
            }
        }
    }
    

    I then indexed two sample documents. The second one is the document from your question; its TimeStamp (January 2021) lies outside your range:

    [{
        "Source": "Market",
        "Category": "electronics",
        "Value": 30,
        "Price": 55.6468,
        "Currency": "EUR",
        "TimeStamp": 1633844180000
    },
    {
        "Source": "Market",
        "Category": "electronics",
        "Value": 20,
        "Price": 45.6468,
        "Currency": "EUR",
        "TimeStamp": 1611506922000
    }]
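
    To check which sample falls inside the requested window, the values can be decoded by hand. The Python sketch below is illustrative only: the helper names from_epoch_millis and in_window are made up for this example, and it assumes the query bounds are epoch seconds while TimeStamp holds epoch milliseconds.

```python
from datetime import datetime, timezone


def from_epoch_millis(ms: int) -> datetime:
    """Interpret an epoch-milliseconds value as a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def in_window(ts_ms: int, gte_s: int, lte_s: int) -> bool:
    """Compare a millisecond timestamp against bounds given in seconds."""
    return gte_s * 1000 <= ts_ms <= lte_s * 1000


# Values taken from the question and the two sample documents.
samples_ms = [1633844180000, 1611506922000]
gte_s, lte_s = 1632844180, 1635436180

for ts in samples_ms:
    # Print the decoded date and whether it lies inside the window.
    print(from_epoch_millis(ts).isoformat(), in_window(ts, gte_s, lte_s))
```

    Only the first sample should be reported as inside the window; the document from the question dates from January 2021.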
    

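    The bounds in your query are epoch seconds, while TimeStamp is stored in epoch milliseconds, so the range does not line up with the stored values. The simplest fix is to multiply the bounds by 1000 (1632844180000 and 1635436180000). If you really need to query with the second-based bounds above, first derive a seconds field from TimeStamp (dividing by 1000) as a runtime field, then query that field: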

    {
        "runtime_mappings": {
            "secondTimeStamp": {
                "type": "long",
                "script": "emit(doc['TimeStamp'].value.millis/1000);"
            }
        },
        "query": {
            "bool": {
                "must": [
                    {
                        "range": {
                            "secondTimeStamp": {
                                "gte": 1632844180,
                                "lte": 1635436180
                            }
                        }
                    }
                ]
            }
        },
        "size": 10
    }
    

    This returns the first sample document (TimeStamp 1633844180000).
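
    As an alternative to the runtime field, the bounds themselves can be converted to milliseconds before the request is sent. The following Python sketch builds such a request body; the function name range_query_ms is hypothetical, and the explicit "format": "epoch_millis" is an optional safeguard that assumes TimeStamp is mapped as a date field.

```python
import json


def range_query_ms(field: str, gte_s: int, lte_s: int, size: int = 10) -> dict:
    """Build a range query body, converting second-based bounds to milliseconds."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "range": {
                            field: {
                                "gte": gte_s * 1000,
                                "lte": lte_s * 1000,
                                # Tell Elasticsearch how to read the bounds.
                                "format": "epoch_millis",
                            }
                        }
                    }
                ]
            }
        },
        "size": size,
    }


body = range_query_ms("TimeStamp", 1632844180, 1635436180)
# The printed JSON can be pasted into Kibana or Postman as the _search body.
print(json.dumps(body, indent=2))
```

    This keeps the index untouched and avoids running a script for every document.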

    As for your second question: by default, index.max_result_window is 10000, which caps from + size for a search. You can raise this limit in the index settings, but larger result windows use more memory:

    PUT /index/_settings
    {
        "index.max_result_window": 999999
    }
    

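    For paging through more than 10,000 results, however, you should use the search_after parameter (together with a sort) instead of raising the limit.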
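
    A minimal sketch of how search_after paging could look in Python. It is not tied to a particular client: search is a hypothetical callable that takes a request body and returns the parsed _search response, and scan_with_search_after is a name invented for this example. The sort on TimeStamp alone is an assumption; if several documents share a timestamp, a unique tiebreaker field in the sort is advisable so that no hits are skipped at page boundaries.

```python
from typing import Callable, Iterator, List


def scan_with_search_after(
    search: Callable[[dict], dict],
    query: dict,
    sort: List[dict],
    page_size: int = 1000,
) -> Iterator[dict]:
    """Yield every hit for `query`, fetching one page at a time."""
    body = {"query": query, "sort": sort, "size": page_size}
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # no more pages
        yield from hits
        # Each hit carries its sort values; the last one marks where
        # the next page starts.
        body["search_after"] = hits[-1]["sort"]


# Usage with a real client might look like this (assumption, not tested here):
#   search = lambda body: es.search(index="my-index", body=body)
#   query = {"range": {"TimeStamp": {"gte": 1632844180000, "lte": 1635436180000}}}
#   for hit in scan_with_search_after(search, query, [{"TimeStamp": "asc"}]):
#       print(hit["_source"])
```

    Because each request only asks for the next page_size hits after the last seen sort values, the 10,000 result window is never exceeded.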