WCF service net.tcp socket connection always gets aborted after 15 minutes, regardless of binding configuration


I have a WCF service (not hosted in IIS) and another application that communicates with it over net.tcp. Everything works well except for long-running operations. The service performs tasks such as running queries or backing up databases, which may take anywhere from several minutes to hours to complete. Performing these tasks asynchronously is not an option.

Problem: at the 15-minute mark, the socket connection gets aborted, and I can't figure out where this is coming from.

Here is the configuration of the WCF service:

<system.serviceModel>
<bindings>
    <netTcpBinding>
        <binding name="NetTcpBindingConfigurationForMyWCF"
                 openTimeout="24:00:00"
                 closeTimeout="24:00:00"
                 sendTimeout="24:00:00"
                 receiveTimeout="24:00:00"
                 maxReceivedMessageSize="9223372036854775807"
                 transferMode="Streamed">
            <readerQuotas maxDepth="2147483647" maxStringContentLength="2147483647" />
        </binding>
    </netTcpBinding>
</bindings>

<services>
    <service name="Namespace.MyWCF">
        <endpoint address=""
                  binding="netTcpBinding"
                  bindingConfiguration="NetTcpBindingConfigurationForMyWCF"
                  name="NetTcpBindingEndpointMyWCF"
                  contract="MyWCF.IMyWCF" />
        <endpoint address="mex" 
                  binding="mexTcpBinding" 
                  bindingConfiguration="" 
                  name="MexTcpBindingEndpoint" 
                  contract="IMetadataExchange" />
        <host>
            <baseAddresses>
                <add baseAddress="net.tcp://SQLServer/MyWCF" />
            </baseAddresses>
        </host>
    </service>
</services>
<behaviors>
    <serviceBehaviors>
        <behavior name="">
            <dataContractSerializer maxItemsInObjectGraph="2147483646"/>
            <serviceThrottling maxConcurrentCalls="1000" maxConcurrentInstances="1000" maxConcurrentSessions="1000"/>
            <serviceMetadata httpGetEnabled="false" httpsGetEnabled="false" />
            <serviceDebug includeExceptionDetailInFaults="true" />
        </behavior>
    </serviceBehaviors>
</behaviors>
</system.serviceModel>

And this is the configuration of my application:

<system.serviceModel>
<bindings>
    <netTcpBinding>
        <binding name="NetTcpBindingConfigurationForMyWCF"
                 openTimeout="24:00:00"
                 closeTimeout="24:00:00"
                 sendTimeout="24:00:00"
                 receiveTimeout="24:00:00"
                 maxReceivedMessageSize="9223372036854775807"
                 transferMode="Streamed">
            <readerQuotas maxDepth="2147483647" maxStringContentLength="2147483647" />
        </binding>
    </netTcpBinding>
</bindings>
<client>
    <endpoint address="net.tcp://SQLServer/MyWCF"
              binding="netTcpBinding"
              bindingConfiguration="NetTcpBindingConfigurationForMyWCF"
              contract="MyWCF.IMyWCF"
              name="NetTcpBindingEndpointMyWCF">
    </endpoint>
</client>
</system.serviceModel>

Here's what I have tried so far, without any luck:
- Set all timeout values to 24 hours
- Added a serviceThrottling config with large values for maxConcurrentCalls, maxConcurrentInstances & maxConcurrentSessions
- Disabled the firewall on both servers
- Enabled Net.TCP port sharing
- Enabled reliable sessions with inactivityTimeout = 24h

No matter what I try I keep getting the following error after 15 minutes:

Exception msg: An existing connection was forcibly closed by the remote host
Exception type: System.Net.Sockets.SocketException
Stack trace:    at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.ServiceModel.Channels.SocketConnection.ReadCore(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout, Boolean closing)


Exception msg: The socket connection was aborted. This could be caused by an error processing your message or a receive timeout being exceeded by the remote host, or an underlying network resource issue. Local socket timeout was '23.23:59:59.9843758'.
Exception type: System.ServiceModel.CommunicationException
Stack trace:    at System.ServiceModel.Channels.SocketConnection.ReadCore(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout, Boolean closing)
   at System.ServiceModel.Channels.SocketConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
   at System.ServiceModel.Channels.DelegatingConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
   at System.ServiceModel.Channels.ConnectionStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.Net.FixedSizeReader.ReadPacket(Byte[] buffer, Int32 offset, Int32 count)
   at System.Net.Security.NegotiateStream.StartFrameHeader(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.NegotiateStream.ProcessRead(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)

Solution

  • I eventually found the root cause of this timeout and am answering my own question in case others encounter the same issue.

    All communication between our servers goes through a firewall. We use SonicWall, which has a TCP inactivity timeout setting. Its default value is 15 minutes, which is where my timeout was coming from.

    I ended up implementing a handle-based mechanism for long-running operations. The calling application requests an operation and receives a handle in return. It then uses the handle to poll the WCF service at regular intervals for the status of the operation, until the operation is complete.

    Problem solved!
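
For anyone who wants to see the shape of the handle mechanism, here is a minimal, language-neutral sketch in Python. It is not the actual implementation and does not use WCF. `OperationService`, `start_operation`, `get_status` and `wait_for` are hypothetical names, and running the work on a background thread inside the service is an assumption. In a real setup, `start_operation` and `get_status` would be short operations on the service contract.

```python
import threading
import time
import uuid


class OperationService:
    """Hypothetical stand-in for the service side (not a WCF API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}   # handle -> "running" | "done" | "failed"
        self._results = {}  # handle -> result, or error text on failure

    def start_operation(self, work):
        """Start `work` in the background and return a handle right away."""
        handle = str(uuid.uuid4())
        with self._lock:
            self._status[handle] = "running"
        worker = threading.Thread(target=self._run, args=(handle, work), daemon=True)
        worker.start()
        return handle

    def _run(self, handle, work):
        try:
            result = work()
            outcome = "done"
        except Exception as exc:
            result = str(exc)
            outcome = "failed"
        with self._lock:
            self._results[handle] = result
            self._status[handle] = outcome

    def get_status(self, handle):
        """Short call the client repeats; returns (status, result or None)."""
        with self._lock:
            return self._status[handle], self._results.get(handle)


def wait_for(service, handle, poll_interval=0.05, timeout=5.0):
    """Poll until the operation is no longer running, or give up after `timeout`."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status, result = service.get_status(handle)
        if status != "running":
            return status, result
        time.sleep(poll_interval)
    raise TimeoutError(f"operation {handle} still running after {timeout}s")


def slow_backup():
    time.sleep(0.2)  # stands in for a long-running backup or query
    return "backup complete"


service = OperationService()
handle = service.start_operation(slow_backup)
final_status, final_result = wait_for(service, handle)
print(final_status, final_result)
```

The point of the design is that no single call stays silent for long: each poll is a short request and response, so as long as the poll interval is well below the firewall's inactivity timeout, the connection should not be seen as idle.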