Tags: python, elasticsearch, elasticsearch-dsl

Trouble setting request specific timeout in Elasticsearch DSL


I'm trying to set a timeout for a specific request using elasticsearch_dsl. I've tried the following:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, F

...

def do_stuff(self, ids):
    client = Elasticsearch(['localhost'], timeout=30)
    s = Search(using=client,
               index='my_index',
               doc_type=['my_type'])
    s = s[0:100]
    f = F('terms', my_field=list(ids))
    # Search objects are immutable: filter() returns a modified copy,
    # so the result has to be assigned back.
    s = s.filter(f)

    response = s.execute()
    return response.hits.hits

Notes:

  • When I change the doc_type to a type containing a million entities, the query runs fine.
  • When I point the doc_type to a few billion entities, I get a timeout error showing the default timeout of 10 seconds.

Following the elasticsearch_dsl docs, I also tried setting the default connection timeout:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, F
from elasticsearch_dsl import connections

connections.connections.create_connection(hosts=['localhost'], timeout=30)

I still received the 10 second timeout error.


Solution

  • For reasons I haven't pinned down, passing the timeout as a per-request parameter via .params() does the trick:

    s = Search(using=client,
               index='my_index',
               doc_type=['my_type']).params(request_timeout=30)
    

    The really interesting part is that the query now takes less than a second to run and the index is only on a single node.
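
To make the mechanics a bit more concrete, here is a small toy model of what seems to be going on. It is not the elasticsearch_dsl implementation: MiniSearch and FakeClient are hypothetical stand-ins written only for illustration, and the "per-request value overrides the client default" rule is an assumption of the sketch. It shows two things: keyword arguments stored by .params() are forwarded to the client's search call at execute time, and every chained method returns a copy, so the result has to be assigned back.

```python
class FakeClient:
    """Hypothetical stand-in for an Elasticsearch client; records each call."""

    def __init__(self, timeout=10):
        self.timeout = timeout  # client-wide default, in seconds
        self.calls = []

    def search(self, index, body, request_timeout=None):
        # Assumption of this sketch: a per-request value wins over the default.
        effective = self.timeout if request_timeout is None else request_timeout
        self.calls.append({"index": index, "body": body, "timeout": effective})
        return {"hits": {"hits": []}}


class MiniSearch:
    """Toy chainable search object: every method returns a modified copy."""

    def __init__(self, using, index):
        self._using = using
        self._index = index
        self._filters = []
        self._params = {}

    def _clone(self):
        s = MiniSearch(self._using, self._index)
        s._filters = list(self._filters)
        s._params = dict(self._params)
        return s

    def filter(self, name, **kwargs):
        s = self._clone()
        s._filters.append({name: kwargs})
        return s

    def params(self, **kwargs):
        s = self._clone()
        s._params.update(kwargs)
        return s

    def execute(self):
        body = {"query": {"bool": {"filter": self._filters}}}
        # Stored params are passed straight through as keyword arguments.
        return self._using.search(index=self._index, body=body, **self._params)


client = FakeClient(timeout=10)
s = MiniSearch(using=client, index="my_index").params(request_timeout=30)
s.filter("terms", my_field=[1, 2])      # result discarded: s is unchanged
s = s.filter("terms", my_field=[1, 2])  # reassigned: the filter is kept
s.execute()
```

After the last line, client.calls holds a single recorded request that carries the 30 second timeout and exactly one terms filter; the un-assigned filter() call left no trace.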