
Service becomes unresponsive with many sockets in CLOSE_WAIT state


I have a WCF service with a NetTcpBinding serving about 100 clients. The clients regularly poll the server for information, and after a while the service stops responding.

Looking at netstat, I can see many connections that are in the CLOSE_WAIT state.

This is my binding:

<netTcpBinding>
  <binding name="default" maxReceivedMessageSize="2147483647" maxBufferPoolSize="2147483647" maxConnections="10000">
    <readerQuotas maxDepth="2147483647" maxStringContentLength="2147483647" maxArrayLength="2147483647" maxBytesPerRead="2147483647" maxNameTableCharCount="2147483647" />
  </binding>
</netTcpBinding>

I have also tried changing closeTimeout from the default of 00:01:00 to 00:00:10, but with no effect.

The machine runs Windows Server 2008 R2 (64-bit).

Update:

I have added a ServiceThrottlingBehavior now, but the result is still the same.

new ServiceThrottlingBehavior
{
    MaxConcurrentCalls = 1000,
    MaxConcurrentInstances = 1000,
    MaxConcurrentSessions = 1000
};

Update 2:

I have set SessionMode to NotAllowed and changed the binding's transfer mode to streamed.

Any ideas what I could do to improve performance or to figure out the problem?


Solution

  • From your description: (1) the clients were initially able to connect to the server without problems, which rules out a basic configuration problem; (2) after a while the server stopped responding, but you didn't say after how long, what the request rate is, or whether the server stopped responding entirely or only responds intermittently. Given that, one possibility is that something is going wrong on the server side. Did you notice anything unusual there? Things to look for:

    1. Thread count: was the thread pool being depleted (some settings cap the number of thread pool threads)? In particular, start the server fresh and watch the thread count until it stops responding. Is there a pattern? You may have deadlocks, long blocking operations, etc. that hold threads for too long.
    2. Memory: is there a memory leak?
    3. Is the service self-hosted? Do you have code that handles the ServiceHost.Faulted event (and restarts the host)? A faulted ServiceHost will not respond to any requests.
    4. See what the WCF performance counters tell you, especially the queue size and the number of active connections. From the performance counters you'll know whether the service is accepting any requests, and whether your throttling configuration is necessary at all.
    5. The ultimate diagnostic tool: have you turned on service-side WCF tracing? Opening the trace file will show you what happened to a request. If you see an exception in the trace file, you are likely to find your root cause.
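For points 4 and 5, a minimal sketch of the service-side configuration might look like the fragment below. It is an illustration, not your actual config: the log path C:\logs\service.svclog and the listener name "xml" are placeholders I made up, and the Warning level is only a starting point (Information or Verbose produces much larger files). Merge the two sections into the service's existing app.config rather than replacing it.

```xml
<configuration>
  <system.diagnostics>
    <sources>
      <!-- Trace WCF runtime events; ActivityTracing lets the viewer group events per request. -->
      <source name="System.ServiceModel"
              switchValue="Warning, ActivityTracing"
              propagateActivity="true">
        <listeners>
          <add name="xml" />
        </listeners>
      </source>
    </sources>
    <sharedListeners>
      <!-- Placeholder path: pick a directory the service account can write to. -->
      <add name="xml"
           type="System.Diagnostics.XmlWriterTraceListener"
           initializeData="C:\logs\service.svclog" />
    </sharedListeners>
    <!-- Flush after every write so the file is current when the service hangs. -->
    <trace autoflush="true" />
  </system.diagnostics>
  <system.serviceModel>
    <!-- Publishes the service-level WCF performance counters for the running service. -->
    <diagnostics performanceCounters="ServiceOnly" />
  </system.serviceModel>
</configuration>
```

The resulting .svclog file can be opened with the Service Trace Viewer (SvcTraceViewer.exe) from the Windows SDK, and the counters show up in Performance Monitor once the service has been restarted with this configuration.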