
WCF NetTcpBinding timeout problems - strange behaviour


The binding I use is NetTcpBinding (over the internet), with maxConnections and all the concurrency settings set to high values (4000 for maxConnections, the concurrency limits, and so on), so the throttling is set really high. In fact, the WCF performance counters on the server show that none of the concurrency counters is maxed out. But I'm experiencing a timeout nightmare, described as follows:

"A timeout which merely defines how long you have to wait for the service to actually fail and give you an error, but modifying the value of this timeout has no impact on the chance of success. Basically, something happens during the first second of the service request which mucks things up. It will never recover. WCF doesn't magically retry the network connection for you. Fine, sometimes establishing a network connection doesn't go well. But, if your timeout is 2 hours, you have to wait 2 whole hours with no chance of it ever working before it finally acknowledges that it didn't work and gives you the error."

I took the above description from this thread; it is exactly the problem I'm getting.
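For context, the timeouts discussed in the quote are the ones configured on the binding. Below is a minimal sketch of a netTcpBinding section. The binding name `tcpInternet` is hypothetical and the timeout values are illustrative assumptions; only the maxConnections value of 4000 comes from the description above.

```xml
<configuration>
  <system.serviceModel>
    <bindings>
      <netTcpBinding>
        <!-- "tcpInternet" is a hypothetical name; the timeout values are illustrative only -->
        <binding name="tcpInternet"
                 maxConnections="4000"
                 openTimeout="00:00:30"
                 sendTimeout="00:01:00"
                 receiveTimeout="00:10:00"
                 closeTimeout="00:00:30" />
      </netTcpBinding>
    </bindings>
  </system.serviceModel>
</configuration>
```

As the quote points out, shortening openTimeout or sendTimeout does not make a stuck call succeed. It only makes the failure surface sooner, so that the client can give up and try again.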

The operators perform the same operation thousands and thousands of times per day, and sometimes they get the timeout error, even though right afterwards they can do the same operation quickly, without any latency. Also, while one client is waiting on the timeout, another instance of the client on the same computer can make the call and it goes fast! Checking for errors on the server side, nothing is logged in the service implementation and nothing is stored in the database, so the method was never invoked by the request. So I strongly suspect some error during the initialization of the call, maybe in authentication, token handling, encryption, or something in that layer, but I cannot find any bloody error message (maybe by enabling WCF tracing?).
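On the tracing question: WCF tracing is switched on in the application's config file, on the client or the server side. Below is a minimal sketch. The listener name `xml` and the log path `C:\logs\wcf-trace.svclog` are hypothetical, and the switch level is just one reasonable choice.

```xml
<configuration>
  <system.diagnostics>
    <sources>
      <!-- System.ServiceModel is the trace source for the WCF runtime -->
      <source name="System.ServiceModel"
              switchValue="Warning, ActivityTracing"
              propagateActivity="true">
        <listeners>
          <add name="xml" />
        </listeners>
      </source>
    </sources>
    <sharedListeners>
      <!-- Hypothetical log path; the process needs write access to it -->
      <add name="xml"
           type="System.Diagnostics.XmlWriterTraceListener"
           initializeData="C:\logs\wcf-trace.svclog" />
    </sharedListeners>
    <trace autoflush="true" />
  </system.diagnostics>
</configuration>
```

The resulting .svclog file can be opened with the Service Trace Viewer (SvcTraceViewer.exe). A failure during channel setup or security negotiation would normally show up there even when the service method itself is never reached.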

Anyway, the fact is that we have the same old software written in classic ASP, with normal HTTP client-server requests, and this problem never happens there, even though we are talking about the same internet connection. So what should I think? Is Internet Explorer doing something magical when it gets an error? Does WCF have some hidden configuration that I can set to improve this behavior? What I can say is that I've tried configuring our WCF service with WSHttpBinding, and this kind of timeout seems never to happen; we still get some errors, but the connection is released immediately. That is surprising, because the connection is much the same as the TCP one: by default WSHttpBinding keeps the underlying TCP connection alive. I still don't understand why I'm getting these timeouts with TCP.

If someone can help me, I will really appreciate it!

Thank you!


Solution

  • The problem was caused by bad hardware, and it was really difficult to debug. Even with Wireshark (a packet sniffer), the packets didn't show any particular errors. We found some TCP retransmissions, which could have been a symptom, but the packets were actually just getting stuck somewhere inside the modem/router, a telecom modem (Pirelli Gate 2 Plus). After we changed the modem/router, the problem completely disappeared.

    Anyway, we figured out that wsHttpBinding over HTTP is more reliable for an internet connection where you have no control and cannot be sure what hardware is installed at the site.

    Hope this helps someone else too :)