elasticsearch, retry-logic, exponential-backoff

Best way to implement retry with exponential backoff in Elasticsearch 7


I am upgrading from ES 5.6 to ES 7. In the past, we cloned the ES 5 repo and added custom code for retry with exponential backoff:

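public void doRetryWithExponentialBackoff(BasicCallback mainAttempt, ExceptionHandlingCallback onFailure) {
    int failure = 0;
    int maxFailure = 6; // number of retries allowed after the first attempt
    while (true) {
        try {
            mainAttempt.execute();
            return;
        } catch (RuntimeException mainException) {
            try {
                failure += 1;
                if (failure <= maxFailure) {
                    // Wait 2^failure seconds before retrying: 2, 4, 8, 16, 32, 64.
                    Thread.sleep((1L << failure) * 1000L);
                } else {
                    // Retries exhausted: hand the last exception to the failure handler.
                    onFailure.execute(mainException);
                    return;
                }
            } catch (InterruptedException interruptedException) {
                // Restore the interrupt flag so callers can still observe the interruption.
                Thread.currentThread().interrupt();
                throw new RuntimeException(interruptedException);
            }
        }
    }
}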

However, we don't want to patch a cloned repo again, and I couldn't find any such functionality in ES 7. What do you recommend for implementing the retry policy?

I am also using Pumba for chaos testing, and the tests that exercise the application's interaction with ES have failed badly. For example, if I kill the ES container or add latency to its responses, the application crashes. I intend to handle these cases with exponential backoff too.

EDIT: I am using the Spring Data framework to access ES 7.


Solution

  • Since ES is just a search engine, it cannot be expected to provide retry functionality on the server side. Providing such support could hamper ES's ability to keep functioning as expected.

    That said, adding such support to the ES client is reasonable, because the responsibility for carrying out the retries then lies with your application, and the ES engine can keep functioning at scale and serving all incoming requests. This is exactly what the AWS SDK clients do by default. My impression is that the code above is an attempt to create a wrapper on top of the ES client, which IMO is also fine.
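
A client-side wrapper of the kind described above could look roughly like the sketch below. It is only an illustration under a few assumptions: `RetryWithBackoff`, `execute`, and `backoffCap` are hypothetical names, not part of the Elasticsearch client or Spring Data. Unlike the code in the question, it returns the result of the call, caps the delay, and randomizes it (jitter) so that many clients do not all retry at the same moment.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not an Elasticsearch or Spring Data API. */
final class RetryWithBackoff {

    private RetryWithBackoff() {}

    /**
     * Runs the action and retries it on RuntimeException, at most maxRetries times.
     * Assumes baseDelayMillis >= 1 and maxDelayMillis >= baseDelayMillis.
     */
    static <T> T execute(Supplier<T> action, int maxRetries,
                         long baseDelayMillis, long maxDelayMillis) {
        int failures = 0;
        while (true) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                failures++;
                if (failures > maxRetries) {
                    throw e; // retries exhausted: surface the last failure
                }
                long cap = backoffCap(failures, baseDelayMillis, maxDelayMillis);
                // Jitter: pick a random delay in [1, cap] milliseconds.
                long delay = ThreadLocalRandom.current().nextLong(cap) + 1;
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    e.addSuppressed(ie);
                    throw e;
                }
            }
        }
    }

    /** Upper bound for the n-th retry delay: min(max, base * 2^(n-1)), without overflow. */
    static long backoffCap(int failures, long baseDelayMillis, long maxDelayMillis) {
        int shift = Math.min(failures - 1, 62);
        return (baseDelayMillis > (maxDelayMillis >> shift))
                ? maxDelayMillis
                : (baseDelayMillis << shift);
    }
}
```

A call site might then read `RetryWithBackoff.execute(() -> repository.findById(id), 6, 1000L, 64000L)`, where `repository` stands in for whatever Spring Data repository or client call the application already makes. In practice it is probably worth retrying only on exceptions that indicate a transient problem (connection refused, timeout) rather than on every `RuntimeException`.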