I have a project in which I download many pages simultaneously in many tasks, which are processed by a thread pool (size = 200). All these tasks use the same method, getPage, to download a page (with Apache HttpClient from HttpComponents and Apache Commons IO):
public static String getPage(String url)
        throws IOException {
    HttpUriRequest request = new HttpGet(url);
    HttpResponse response = HTTP_CLIENT_BUILDER.build().execute(request);
    try (InputStream content = response.getEntity().getContent()) {
        return IOUtils.toString(content, "UTF-8");
    }
}
where HTTP_CLIENT_BUILDER is a static field initialized this way:
private static final HttpClientBuilder HTTP_CLIENT_BUILDER = HttpClients.custom()
        .setDefaultRequestConfig(RequestConfig.custom()
                .setSocketTimeout(SOCKET_TIMEOUT_MS) // 60_000
                .setConnectTimeout(CONNECTION_TIMEOUT_MS) // 5_000
                .build());
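For context, a socket timeout (SO_TIMEOUT) like the one set above limits how long a single read may wait for data, not how long the whole download may take. The sketch below illustrates that with plain JDK sockets instead of HttpClient; the class PerReadTimeoutDemo, its method, and the numbers are hypothetical and only serve the demonstration. A local server sends one byte at a time with a short pause, and the client finishes without a timeout even though the total time exceeds SO_TIMEOUT.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class PerReadTimeoutDemo {

    /**
     * Reads `chunks` bytes from a local server that sends one byte every
     * `gapMs` milliseconds, with SO_TIMEOUT set to `soTimeoutMs`.
     * Returns {bytesRead, elapsedMs}.
     */
    static long[] slowDownload(int chunks, long gapMs, int soTimeoutMs) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread sender = new Thread(() -> {
                try (Socket peer = server.accept();
                     OutputStream out = peer.getOutputStream()) {
                    peer.setTcpNoDelay(true);
                    for (int i = 0; i < chunks; i++) {
                        Thread.sleep(gapMs); // each pause is shorter than SO_TIMEOUT
                        out.write('x');
                        out.flush();
                    }
                } catch (IOException | InterruptedException ignored) {
                    // demo only: nothing to recover
                }
            }, "slow-sender");
            sender.setDaemon(true);
            sender.start();

            long start = System.nanoTime();
            long bytesRead = 0;
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(soTimeoutMs); // applies to each read separately
                InputStream in = client.getInputStream();
                while (in.read() != -1) {
                    bytesRead++;
                }
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            return new long[] {bytesRead, elapsedMs};
        } catch (IOException e) {
            // A SocketTimeoutException would also end up here
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long[] result = slowDownload(12, 100, 1000);
        System.out.println("bytes read: " + result[0] + ", elapsed ms: " + result[1]);
    }
}
```

This does not explain the hang itself; it only shows why a per-read timeout and an overall deadline for a request are different things.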
Problem statement: at some point (when most of the tasks have finished), all the remaining threads get stuck in the native method SocketInputStream.socketRead0, so jdb reports all of them as running (which is what I'd expect for a thread inside a native method :-) ):
> threads
Group system:
(java.lang.ref.Reference$ReferenceHandler)0xac4 Reference Handler cond. waiting
(java.lang.ref.Finalizer$FinalizerThread)0xac5 Finalizer cond. waiting
(java.lang.Thread)0xac6 Signal Dispatcher running
(java.lang.Thread)0xac7 Java2D Disposer cond. waiting
Group main:
(java.lang.Thread)0xac9 pool-1-thread-5 running
(java.lang.Thread)0xaca pool-1-thread-12 running
(... 12 more threads from ThreadPool ...)
(java.lang.Thread)0xad7 DestroyJavaVM running
> where 0xac9
[1] java.net.SocketInputStream.socketRead0 (native method)
[2] java.net.SocketInputStream.read (SocketInputStream.java:150)
[3] java.net.SocketInputStream.read (SocketInputStream.java:121)
[4] sun.security.ssl.InputRecord.readFully (InputRecord.java:465)
[5] sun.security.ssl.InputRecord.read (InputRecord.java:503)
[6] sun.security.ssl.SSLSocketImpl.readRecord (SSLSocketImpl.java:961)
[7] sun.security.ssl.SSLSocketImpl.performInitialHandshake (SSLSocketImpl.java:1,363)
[8] sun.security.ssl.SSLSocketImpl.startHandshake (SSLSocketImpl.java:1,391)
[9] sun.security.ssl.SSLSocketImpl.startHandshake (SSLSocketImpl.java:1,375)
[10] org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket (SSLConnectionSocketFactory.java:275)
[11] org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket (SSLConnectionSocketFactory.java:254)
[12] org.apache.http.impl.conn.HttpClientConnectionOperator.connect (HttpClientConnectionOperator.java:117)
[13] org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect (PoolingHttpClientConnectionManager.java:314)
[14] org.apache.http.impl.execchain.MainClientExec.establishRoute (MainClientExec.java:363)
[15] org.apache.http.impl.execchain.MainClientExec.execute (MainClientExec.java:219)
[16] org.apache.http.impl.execchain.ProtocolExec.execute (ProtocolExec.java:195)
[17] org.apache.http.impl.execchain.RetryExec.execute (RetryExec.java:86)
[18] org.apache.http.impl.execchain.RedirectExec.execute (RedirectExec.java:108)
[19] org.apache.http.impl.client.InternalHttpClient.doExecute (InternalHttpClient.java:186)
[20] org.apache.http.impl.client.CloseableHttpClient.execute (CloseableHttpClient.java:82)
[21] org.apache.http.impl.client.CloseableHttpClient.execute (CloseableHttpClient.java:106)
[22] <package>.Utils.getPage (Utils.java:122)
[23...] <internal details>
> # the same picture for all of them
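The "running" state is easy to reproduce outside the project. Here is a minimal sketch using only JDK sockets (the class BlockedReadStateDemo and its method are hypothetical names): a thread blocks reading from a local peer that never sends anything, and its state is sampled from another thread. On the JDKs I am aware of, a thread blocked in a native socket read is reported as RUNNABLE rather than WAITING or BLOCKED, which matches what jdb shows above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class BlockedReadStateDemo {

    /** Returns the state of a thread while it is blocked reading from a silent peer. */
    static Thread.State stateWhileBlockedInRead() {
        // The accepted socket is kept open but never written to.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket silentPeer = server.accept()) {

            Thread reader = new Thread(() -> {
                try {
                    client.getInputStream().read(); // blocks: no data ever arrives
                } catch (IOException expected) {
                    // thrown once the socket is closed below
                }
            }, "blocked-reader");
            reader.setDaemon(true);
            reader.start();

            Thread.sleep(500); // give the reader time to enter the read call
            Thread.State state = reader.getState();

            client.close();    // unblocks the reader
            reader.join(2000);
            return state;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("State while blocked in read: " + stateWhileBlockedInRead());
    }
}
```

So a "running" thread in the listing does not mean the thread is making progress; it may simply be waiting inside the native read.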
I don't understand why this happens, but I've found a Java bug that may be related to the issue. So maybe I'm looking not for a real solution but for a workaround.
Since the bug is filed against Linux, I should mention that I'm also running on a virtual machine with Ubuntu 14.04 x86_64.
UPD: OK, what I've tried now is adding another timeout with setConnectionRequestTimeout (just to rule it out) and adding a finally block within getPage:
...
try (InputStream content = response.getEntity().getContent()) {
    return IOUtils.toString(content, "UTF-8");
} finally {
    httpClient.getConnectionManager().closeIdleConnections(0, TimeUnit.NANOSECONDS);
}
Let's see if this helps.
UPD2: this seems to help a little, but I still have these permanently running tasks getting stuck approximately once a day.
Unfortunately, I've failed to find any simple workaround (or a real solution), so I've written my own workaround. I hope it'll help someone with this error:
Create the class ConnectionsSupervisor:
private static class ConnectionsSupervisor extends Thread {
    private Set<RequestEntry> streams = new CopyOnWriteArraySet<>();

    public ConnectionsSupervisor() {
        setDaemon(true);
        setName("Connections supervisor");
    }

    @Override
    public void run() {
        while (true) {
            try {
                Thread.sleep(CONNECTIONS_SUPERVISOR_WAIT_MS);
            } catch (InterruptedException ignored) {
            }
            long time = timestamp();
            streams.stream().filter(entry -> time > entry.timeoutBorder).forEach(entry -> {
                HttpUriRequest request = entry.request;
                System.err.format("HttpUriRequest killed after timeout (%d sec.) exceeded: %s%n",
                        FULL_CONNECTION_TIMEOUT_S,
                        request);
                request.abort();
            });
        }
    }

    public void addRequest(HttpUriRequest request) {
        streams.add(new RequestEntry(timestamp() + FULL_CONNECTION_TIMEOUT_S, request));
    }

    public void removeRequest(HttpUriRequest request) {
        streams.removeIf(entry -> entry.request == request);
    }

    private static class RequestEntry {
        private long timeoutBorder;
        private HttpUriRequest request;

        public RequestEntry(long timeoutBorder, HttpUriRequest request) {
            this.timeoutBorder = timeoutBorder;
            this.request = request;
        }
    }
}

public static long timestamp() {
    return Instant.now().getEpochSecond();
}
Somewhere there should be an instance of ConnectionsSupervisor, something like:
private static final ConnectionsSupervisor connectionsSupervisor = new ConnectionsSupervisor();

static {
    connectionsSupervisor.start();
}
In a method like getPage:
HttpUriRequest request = ...;
// ...
connectionsSupervisor.addRequest(request);
try (InputStream content = httpClient.execute(request).getEntity().getContent()) {
    return IOUtils.toString(content, "UTF-8");
    // or any other usage
} finally {
    connectionsSupervisor.removeRequest(request);
    // highly important!
}
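To check the idea in isolation, here is a small JDK-only sketch of the same pattern. It is not the code above: it uses a ScheduledExecutorService instead of a polling thread, and Socket.close() stands in for request.abort(). The class DeadlineAbortDemo and its method are hypothetical names. A read that would otherwise block forever is cut off from another thread once the deadline passes, and the scheduled abort is cancelled in finally, mirroring addRequest and removeRequest.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class DeadlineAbortDemo {

    // One daemon thread that fires the aborts, playing the supervisor's role.
    private static final ScheduledExecutorService WATCHDOG =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "deadline-watchdog");
                t.setDaemon(true);
                return t;
            });

    /**
     * Tries to read one byte from a peer that never answers.
     * Returns true if the blocked read was aborted by the watchdog.
     */
    static boolean readIsAbortedAfter(long deadlineMs) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket silentPeer = server.accept()) {

            // Register the abort before blocking (the addRequest step).
            ScheduledFuture<?> abort = WATCHDOG.schedule(() -> {
                try {
                    client.close(); // stand-in for request.abort()
                } catch (IOException ignored) {
                    // nothing useful to do if closing fails
                }
            }, deadlineMs, TimeUnit.MILLISECONDS);

            try {
                client.getInputStream().read(); // would block forever on its own
                return false;
            } catch (IOException closedByWatchdog) {
                return true;
            } finally {
                abort.cancel(false); // the removeRequest step
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read aborted by watchdog: " + readIsAbortedAfter(300));
    }
}
```

The important part in both versions is the finally block: without it, a request that finished normally would still be aborted (or kept in the set) later.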