
How can we troubleshoot intermittent "An existing connection was forcibly closed" errors caused by a Cisco CSS?


We have the "standard" three-tier architecture, with our middle tier hosted in IIS and accessed via .NET Remoting. These errors occur when our web and web services servers (front tier) make remoting calls to the app servers (middle tier). We get this error 3-10 times a day out of ~130K total calls per day.

The exception and stack trace always look similar to this:


Exception Type: System.Net.WebException
Message: The underlying connection was closed: An unexpected error occurred on a receive.

Server stack trace: 
   at System.Runtime.Remoting.Channels.Http.HttpClientTransportSink.ProcessResponseException(WebException webException, HttpWebResponse& response)
   at System.Runtime.Remoting.Channels.Http.HttpClientTransportSink.ProcessMessage(IMessage msg, ITransportHeaders requestHeaders, Stream requestStream, ITransportHeaders& responseHeaders, Stream& responseStream)
   at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)

Exception rethrown at [0]: 
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at XXXXX.BusinessFacade.Interface.XXXXInterface.SubmitXXXX(
   at XXX.XXXXWebServicesLibrary.XXXXService.CreateXXXXXX.RunXXXXMethod()
   at XXX.XXXXWebServicesLibrary.XXXXService.XXXXXXMethod`2.RunMethod()
   at XXX.XXXXWebServicesLibrary.XXXXXWebMethod`2.Run()
Inner Exception: 

Exception Type: System.IO.IOException
Message: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.PooledStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.Connection.SyncRead(HttpWebRequest request, Boolean userRetrievedStream, Boolean probeRead)
Inner Exception: 

Exception Type: System.Net.Sockets.SocketException
Message: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)

No particular remoting call causes this; it can be any of them, which seems to rule out an application-specific cause. The only common denominator is the "Exception Type: System.Net.Sockets.SocketException Message: An existing connection was forcibly closed by the remote host" portion of the error.

The front and middle tiers are separated by a firewall, and we also use a VIP device. I strongly suspect an issue with our network/firewall configuration, but our network guys are just scratching their heads and not offering any suggestions.

Although a 0.003% failure rate may seem insignificant, we have partners that scrutinize our communications very carefully and I am just waiting for this to become an issue they notice. I don't want to have to say "I don't know" when that time comes.
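As a quick sanity check on that figure, here is a minimal Python sketch that derives the daily failure rate from the counts quoted above (3-10 errors out of roughly 130K calls). The function and constant names are hypothetical and exist only for this illustration.

```python
def failure_rate_percent(failures: int, total_calls: int) -> float:
    """Return failures as a percentage of total calls."""
    return failures / total_calls * 100


# "~130K total calls in the day" (approximate figure from the question)
TOTAL_CALLS = 130_000

low = failure_rate_percent(3, TOTAL_CALLS)    # best day: 3 errors
high = failure_rate_percent(10, TOTAL_CALLS)  # worst day: 10 errors

print(f"{low:.4f}% to {high:.4f}%")  # prints "0.0023% to 0.0077%"
```

The 0.003% quoted above sits near the low end of that range.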

Does anyone have any ideas on how I could provide more information or any suggestions I could make to our network guys to get this resolved?


Solution

  • The problem was the Cisco CSS. We determined this by pointing the tier 1 servers directly to the tier 2 servers and going 48 hours without observing the problem. Once we determined it was the CSS, we corrected this problem by adjusting the insanely low default value for this parameter:

    "Default flow inactivity timeouts, in seconds, for the TCP or UDP port. If a flow is idle for the amount of time specified in the timeout value, the CSS tears down the flow and reclaims the flow resources."

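    We set this to 84. The value is counted in 16-second increments, so 84 works out to 84 × 16 = 1,344 seconds, or about 22 minutes. The default value was too low: the default keep-alive for HTTP is 120 seconds, so the CSS was tearing down idle flows while both servers still considered the connection open.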
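    The arithmetic behind that setting can be sketched in a few lines of Python. This is only an illustration: it assumes, as described above, that the CSS value is a multiplier of 16-second units and that the HTTP keep-alive is 120 seconds. The function names are hypothetical and are not part of any Cisco or .NET API.

```python
# Assumption: the CSS timeout value is a multiplier of 16-second units,
# as described in the answer ("84 16-second increments").
CSS_INCREMENT_SECONDS = 16


def flow_timeout_seconds(multiplier: int) -> int:
    """Convert a CSS timeout multiplier into seconds."""
    return multiplier * CSS_INCREMENT_SECONDS


def outlives_keep_alive(multiplier: int, keep_alive_seconds: int = 120) -> bool:
    """True when an idle flow survives at least as long as the HTTP keep-alive."""
    return flow_timeout_seconds(multiplier) >= keep_alive_seconds


def min_safe_multiplier(keep_alive_seconds: int = 120) -> int:
    """Smallest multiplier whose timeout is not shorter than the keep-alive."""
    return -(-keep_alive_seconds // CSS_INCREMENT_SECONDS)  # ceiling division


print(flow_timeout_seconds(84))  # 1344
print(outlives_keep_alive(84))   # True
print(min_safe_multiplier())     # 8
```

    Under these assumptions, any multiplier of 8 or more (128 seconds) would already exceed the 120-second keep-alive, so 84 leaves a very wide margin.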