Tags: java, multithreading, jersey, marklogic, jersey-client

HTTP Connections from MarkLogic Java Client API run into deadlock


After stress testing our Java-based middleware, which accesses the MarkLogic server via the Java Client API, I ran into a situation where no more HTTP connections can be opened and a deadlock occurs. I am using one shared DatabaseClient instance, but I create a JSONDocumentManager on each request (with a JacksonHandle for reading, and no explicit closing). Could it be that connections are not closed properly, or do I have to close them myself?

Looking at netstat at the point where no more connections can be handled, I see exactly 109 connections to the MarkLogic server (running on localhost:8040) in FIN_WAIT_2:

ffffff8045f765a0 31c91c01 tcp4       0      0  localhost.8040     localhost.65396    FIN_WAIT_2 

and the same number (109) of TCP connections in CLOSE_WAIT:

ffffff804ff83400 73965e73 tcp4       0      0  localhost.49286    localhost.8040     CLOSE_WAIT
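
As a rough aid for repeating this check during a stress test, the sketch below tallies TCP states for one port from captured netstat lines. It assumes the column layout of the two sample lines above (protocol, two queue counters, local address, foreign address, state); the class and method names are hypothetical. In general, CLOSE_WAIT on the client side means the peer has closed its end while the local application has not yet closed the socket, which would be consistent with connections not being released, although the counts alone do not prove it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts TCP connection states for a given port suffix.
class NetstatStateCounter {

    /** Counts states of TCP lines whose local or foreign address ends with portSuffix, e.g. ".8040". */
    static Map<String, Integer> countStates(List<String> lines, String portSuffix) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            // Locate the protocol column (tcp4, tcp6, ...); leading columns vary by netstat flags.
            int proto = -1;
            for (int i = 0; i < f.length; i++) {
                if (f[i].startsWith("tcp")) {
                    proto = i;
                    break;
                }
            }
            // After the protocol we expect: recv-q, send-q, local, foreign, state.
            if (proto < 0 || proto + 5 >= f.length) {
                continue; // not a TCP connection line in the assumed layout
            }
            String local = f[proto + 3];
            String foreign = f[proto + 4];
            String state = f[proto + 5];
            if (local.endsWith(portSuffix) || foreign.endsWith(portSuffix)) {
                Integer old = counts.get(state);
                counts.put(state, old == null ? 1 : old + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "ffffff8045f765a0 31c91c01 tcp4       0      0  localhost.8040     localhost.65396    FIN_WAIT_2",
            "ffffff804ff83400 73965e73 tcp4       0      0  localhost.49286    localhost.8040     CLOSE_WAIT");
        System.out.println(countStates(sample, ".8040"));
    }
}
```

Feeding it the full netstat output captured at the moment of the hang should reproduce the 109/109 split reported here, assuming the same output format.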

I am using MarkLogic Server 7.0.4 with Java 1.7 (Mac OS X 10.9.5) and MarkLogic Java Client API 2.0.4. Here is the first part of the thread dump (there are 10 similar threads that all seem to be waiting for the server response):

"http-nio-8080-exec-10" #31 daemon prio=5 os_prio=31 tid=0x00007fc61f344000 nid=0x7c03 waiting on condition [0x00000001265bb000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007a59cfff8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at org.apache.http.impl.conn.tsccm.WaitingThread.await(WaitingThread.java:159)
    at org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking(ConnPoolByRoute.java:398)
    at org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry(ConnPoolByRoute.java:298)
    at org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection(ThreadSafeClientConnManager.java:238)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:423)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:115)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
    at com.sun.jersey.client.apache4.ApacheHttpClient4Handler.handle(ApacheHttpClient4Handler.java:170)
    at com.marklogic.client.impl.DigestChallengeFilter.handle(DigestChallengeFilter.java:34)
    at com.sun.jersey.api.client.filter.HTTPDigestAuthFilter.handle(HTTPDigestAuthFilter.java:493)
    at com.sun.jersey.api.client.Client.handle(Client.java:648)
    at com.sun.jersey.api.client.WebResource.handle(WebResource.java:680)
    at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
    at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:507)
    at com.marklogic.client.impl.JerseyServices.getDocumentImpl(JerseyServices.java:612)
    at com.marklogic.client.impl.JerseyServices.getDocument(JerseyServices.java:568)
    at com.marklogic.client.impl.DocumentManagerImpl.read(DocumentManagerImpl.java:270)
    at com.marklogic.client.impl.DocumentManagerImpl.read(DocumentManagerImpl.java:204)
    at com.marklogic.client.impl.DocumentManagerImpl.read(DocumentManagerImpl.java:164)
    at com.acme.dashboard.service.ReportMetadataRepository.getByName(ReportMetadataRepository.java:64)

The rest of the stack trace is omitted for readability. After looking at JerseyServices, I also tried tweaking the following system properties, unfortunately without any improvement:

com.marklogic.client.maximumRetrySeconds: 3 (default: 120)
com.marklogic.client.minimumRetries: 3 (default: 8)
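
For reference, one way to set these two properties programmatically is sketched below; passing them as `-D` flags on the JVM command line is equivalent. The property names come from the question, while the class name is hypothetical. When the client library actually reads the values is an assumption here: setting them before the DatabaseClient is created is the cautious choice.

```java
// Hypothetical helper that sets the two retry-related system properties named above.
class RetryProperties {
    static final String MAX_RETRY_SECONDS = "com.marklogic.client.maximumRetrySeconds";
    static final String MIN_RETRIES = "com.marklogic.client.minimumRetries";

    /** Stores both values as JVM-wide system properties (system properties are always strings). */
    static void apply(int maxRetrySeconds, int minRetries) {
        System.setProperty(MAX_RETRY_SECONDS, String.valueOf(maxRetrySeconds));
        System.setProperty(MIN_RETRIES, String.valueOf(minRetries));
    }

    public static void main(String[] args) {
        // Values tried in the question; call this before creating the DatabaseClient (assumption).
        apply(3, 3);
        System.out.println(MAX_RETRY_SECONDS + "=" + System.getProperty(MAX_RETRY_SECONDS));
        System.out.println(MIN_RETRIES + "=" + System.getProperty(MIN_RETRIES));
    }
}
```

Judging by their names, these settings concern retries rather than connection release, which may be why changing them made no difference.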

Solution

  • It sounds like you may be encountering a bug in which JacksonHandle and TuplesHandle do not close their connections (GitHub issue #89). This was fixed in Java Client API 2.0.5. Are you able to run your tests against a 7.0-5 instance of MarkLogic Server using version 2.0.5 of the Java Client API?
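
To illustrate why unreleased connections would produce threads parked in ConnPoolByRoute.getEntryBlocking as in the dump above, here is a toy model of a bounded connection pool. It is not HttpClient or MarkLogic code: a Semaphore stands in for the pool, all names are hypothetical, and a short timeout replaces the indefinite wait so that the demo terminates.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Toy model of a bounded connection pool; each permit represents one pooled connection.
class LeakyPoolDemo {
    private final Semaphore slots;

    LeakyPoolDemo(int maxConnections) {
        this.slots = new Semaphore(maxConnections);
    }

    /** Tries to lease a connection, waiting up to timeoutMillis; returns false if none became free. */
    boolean lease(long timeoutMillis) {
        try {
            return slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    /** Returns a connection to the pool; this is the step a leaking caller never performs. */
    void release() {
        slots.release();
    }

    /** Runs sequential requests and counts how many obtained a connection. */
    static int countSuccessfulLeases(int poolSize, int requests, boolean releaseAfterUse) {
        LeakyPoolDemo pool = new LeakyPoolDemo(poolSize);
        int ok = 0;
        for (int i = 0; i < requests; i++) {
            if (pool.lease(50)) {
                ok++;
                if (releaseAfterUse) {
                    pool.release();
                }
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("without release: " + countSuccessfulLeases(2, 5, false));
        System.out.println("with release:    " + countSuccessfulLeases(2, 5, true));
    }
}
```

Without release, only as many requests succeed as the pool has slots, and every later request waits. With an untimed wait, as in the thread dump, those requests would hang indefinitely, which looks like a deadlock from the outside.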