Tags: java, web-services, http, weblogic, cxf

Why does an Apache CXF 2.7.8 client in WebLogic Application Server 12c ignore FIN from the server?


I have an Apache CXF 2.7.8 consumer calling another SOAP web service.

In my development environment (Tomcat 6.0, jdk1.7.0_51, Windows 7) everything works charmingly.

However, when I deploy the code to a test environment (WebLogic 12.1, jdk1.7.0_51, RHEL 6), every second request fails with a javax.xml.ws.WebServiceException: Could not send Message, caused by java.net.SocketTimeoutException: Read time out after x millis.

Both the development and test instances call the exact same server.

When I perform a network trace, I see that CXF sends many requests using the same socket connection (thanks to HTTP Keep-Alive). Eventually, the server sends a FIN indicating that the client should stop using this connection (and establish a new one, if need be). The client acknowledges the FIN, but then continues to send the next request on the same socket, despite having been told to disconnect (and acknowledging that directive). The server then sends an RST (as expected), telling the client to go away. The client then tries again. Eventually, enough time has elapsed that we reach the Read timeout, and the SocketTimeoutException above is thrown.

(As an aside: on the Windows development platform, the client honors the FIN and establishes a new socket connection for the next request.)

When I disable HTTP Keep-Alive (using the instructions here), the server sends the FIN after only a single request from the client (exactly as it should). The client still acknowledges the FIN with an ACK for that frame, and then boldly continues trying to use that socket.

I would love to have HTTP Keep-Alive working, but I would settle for the darn thing working without it.

Are there any recommended solutions or next steps for troubleshooting?


Solution

  • TL;DR: Add -Dhttp.keepalivecache.sockethealthchecktimeout=10 to the JVM arguments for the WebLogic server.

    Here's what we learned eventually:

    The client (Apache CXF 2.7.8 on WebLogic 12c) was sending SOAP HTTP requests to the server (not a WebLogic server).

    The server was (at least in some cases) failing to send a 'Connection' header in the response. This resulted in WebLogic not knowing whether it could reuse the connection or not. When it tried to reuse a connection that had been closed by the server, we got the error.

    WebLogic has a parameter that can instruct it to perform a health check on the reused connection prior to reusing it, and evict it from the pool if it fails the health check. Setting the system property 'http.keepalivecache.sockethealthchecktimeout' to a very low value (say 10, for 10 milliseconds) fixed the problem.
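
To illustrate the idea behind a socket health check with a short timeout, here is a minimal sketch using only the JDK. It is not WebLogic's actual implementation; the class name SocketHealthCheck and the method looksHealthy are hypothetical names chosen for this example. The assumption is that an idle keep-alive connection should have nothing to read: a read that times out means the peer is still there, while a read that returns end-of-stream means the peer has already sent its FIN.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; not WebLogic's internal code.
public class SocketHealthCheck {

    /**
     * Probes an idle pooled socket before it is reused.
     * timeoutMillis should be a small positive value (0 would block forever).
     */
    public static boolean looksHealthy(Socket socket, int timeoutMillis) {
        try {
            int previousTimeout = socket.getSoTimeout();
            socket.setSoTimeout(timeoutMillis);
            try {
                // On an idle connection any result is bad news:
                // -1 means the peer closed (FIN), a byte means stray data.
                socket.getInputStream().read();
                return false;
            } catch (SocketTimeoutException e) {
                // Nothing arrived within the timeout: no FIN seen, safe to reuse.
                return true;
            } finally {
                socket.setSoTimeout(previousTimeout);
            }
        } catch (IOException e) {
            // Connection reset or otherwise broken: evict it from the pool.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            // Peer still open: the probe read times out, so the check passes.
            System.out.println("before server close: " + looksHealthy(client, 10));

            // Server closes its side, which sends a FIN to the client.
            accepted.close();
            Thread.sleep(200); // give the FIN time to arrive on loopback

            // The probe read now hits end-of-stream, so the check fails.
            System.out.println("after server close:  " + looksHealthy(client, 10));
        }
    }
}
```

A client that runs a probe like this before each reuse would open a fresh connection instead of writing a request into a socket the server has already closed, which is the behavior the system property above appears to enable in WebLogic's HTTP client.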