java rest elasticsearch apache-httpasyncclient

OutOfMemoryError with elasticsearch REST Java client via apache http nio


We are using the Elasticsearch REST Java client (we are on Java 7, so we can't use the normal Elasticsearch Java client) to interact with our Elasticsearch servers. This all works fine except when we try to do an initial indexing of about 1.3 million documents. It runs for a while, but after a few hundred thousand documents we get:

20/06 21:27:33,153 ERROR [cid=51][stderr][write:71] (pool-837116-thread-1) Exception in thread "pool-837116-thread-1" java.lang.OutOfMemoryError: unable to create new native thread
20/06 21:27:33,154 ERROR [cid=51][stderr][write:71] (pool-837116-thread-1)  at java.lang.Thread.start0(Native Method)
20/06 21:27:33,154 ERROR [cid=51][stderr][write:71] (pool-837116-thread-1)  at java.lang.Thread.start(Thread.java:693)
20/06 21:27:33,154 ERROR [cid=51][stderr][write:71] (pool-837116-thread-1)  at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:334)
20/06 21:27:33,154 ERROR [cid=51][stderr][write:71] (pool-837116-thread-1)  at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:194)
20/06 21:27:33,154 ERROR [cid=51][stderr][write:71] (pool-837116-thread-1)  at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
20/06 21:27:33,155 ERROR [cid=51][stderr][write:71] (pool-837116-thread-1)  at java.lang.Thread.run(Thread.java:724)

followed by

java.lang.IllegalStateException: Request cannot be executed; I/O reactor status: STOPPED
    at org.apache.http.util.Asserts.check(Asserts.java:46)
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.ensureRunning(CloseableHttpAsyncClientBase.java:90)
    at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:123)
    at org.elasticsearch.client.RestClient.performRequestAsync(RestClient.java:343)
    at org.elasticsearch.client.RestClient.performRequestAsync(RestClient.java:325)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:218)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:191)

As you can see, the Elasticsearch REST client uses Apache HTTP NIO. What I find odd is that the NIO library seems to create a thread for every single request (or connection?). In the log above you can see the thread name (pool-837116-thread-1). There are also lots of I/O dispatcher threads with ever-increasing numbers.
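One possible reading of a name like pool-837116-thread-1: the JDK's default thread factory numbers its threads pool-N-thread-M, where N goes up by one for every factory that is created in the JVM. The sketch below shows only that JDK behaviour. Whether the async HTTP client falls back to `Executors.defaultThreadFactory()` for each new client is an assumption here, not something confirmed in this post. `PoolNameDemo` and `firstThreadName` are made-up names for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class PoolNameDemo {

    /** Creates a brand-new default factory and returns the name of its first thread. */
    static String firstThreadName() {
        // Every call to defaultThreadFactory() returns a new factory with a new pool number.
        ThreadFactory factory = Executors.defaultThreadFactory();
        Thread t = factory.newThread(new Runnable() {
            @Override
            public void run() {
                // never started; only the generated name matters here
            }
        });
        return t.getName();
    }

    public static void main(String[] args) {
        // Both names end in "-thread-1"; the pool number of the second is higher.
        System.out.println(firstThreadName());
        System.out.println(firstThreadName());
    }
}
```

If that assumption holds, a large pool number would count how many clients have been built so far, not how many threads are alive at once.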

The total number of live threads doesn't seem to change much, though. So it seems that, rather than reusing threads, a new thread (or actually two) is created for each connect cycle. The upload is basically:

1. Create client

    restClient = RestClient.builder(new HttpHost(host.getHost(),host.getPort(),host.getProtocol())/*,new HttpHost(host.getHost(),host.getPort()+1,host.getProtocol())*/)
                            .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
                                @Override
                                public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                                    return httpClientBuilder
                                            .setDefaultCredentialsProvider(credsProvider);
                                }
                            }).setMaxRetryTimeoutMillis(30000).build();

2. Send request with json body and close client

        try{
            HttpEntity entity = new NStringEntity(json,ContentType.APPLICATION_JSON);
            Response indexResponse = restClient.performRequest("PUT", endpoint, parameters,entity,header);
            log.debug("Response #0 #1", indexResponse,indexResponse.getStatusLine());
            log.debug("Entity #0",indexResponse.getEntity());

        }finally{
            if(restClient!=null){
                log.debug("Closing restClient #0", restClient);
                restClient.close();
            }
        }

Is this normal? Why isn't Apache NIO reusing threads? Is this a problem with the Elasticsearch REST client, Apache NIO or my code? I call close on the restClient, and I'm not sure what else I am supposed to do.

I've tried to set the thread count to just 1 on the IO Reactor:

    restClient = RestClient.builder(new HttpHost(host.getHost(),host.getPort(),host.getProtocol())/*,new HttpHost(host.getHost(),host.getPort()+1,host.getProtocol())*/)
                            .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
                                @Override
                                public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                                    return httpClientBuilder
                                            .setDefaultCredentialsProvider(credsProvider)
                                            .setDefaultIOReactorConfig(IOReactorConfig.custom().setIoThreadCount(1).build()); //set to one thread
                                }
                            }).setMaxRetryTimeoutMillis(30000).build();

but that didn't change anything regarding the reuse of threads.


Solution

  • I've found the reason for the OutOfMemoryError. Although I was closing the client in a try-finally block, an exception was thrown outside of that block (the block didn't cover everything, d'oh), so in that case the client was never closed. It still looks wrong that so many threads are being created, although the overall number of threads does not increase significantly.
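A minimal sketch of the shape that avoids this kind of leak, using only the JDK so it runs on its own. `FakeClient` is a hypothetical stand-in for the real REST client (it only counts open instances), and `uploadAll` is an illustrative name, not code from the original upload. The idea is one client for the whole batch, with everything that can throw sitting inside the block that guarantees `close()`.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SingleClientUpload {

    /** Hypothetical stand-in for the REST client: tracks how many instances are still open. */
    static final class FakeClient implements Closeable {
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeClient() {
            OPEN.incrementAndGet();
        }

        /** Pretends to index a document; an empty body simulates a failing request. */
        void index(String json) throws IOException {
            if (json.isEmpty()) {
                throw new IOException("empty document");
            }
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Sends all documents through one client and returns how many of them failed. */
    static int uploadAll(List<String> docs) {
        int failed = 0;
        // try-with-resources (Java 7+) closes the client however the block exits.
        try (FakeClient client = new FakeClient()) {
            for (String doc : docs) {
                try {
                    client.index(doc);
                } catch (IOException e) {
                    failed++; // one bad document neither aborts the batch nor skips close()
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        int failed = uploadAll(Arrays.asList("{\"a\":1}", "", "{\"b\":2}"));
        System.out.println("failed=" + failed + " open=" + FakeClient.OPEN.get());
    }
}
```

With the real client the same structure would apply as long as the client is created inside (or immediately before) the guarded block and nothing that can throw sits between creation and the `try`.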