asp.net performance tcp iis-7.5 threadpool

Detecting outbound connection queuing for ASP.NET website


Is there any way to detect when attempted outbound connections are queuing?

Our ASP.NET application makes a lot of outbound requests to other web services. Recently we ran into major performance issues, where calls to a particular endpoint were taking many seconds to complete or timing out. The owners of that service did not see any performance issues on their end. When we analyzed the network traffic, we saw that the HTTP requests were indeed completing in a timely manner. That's when we figured out that our long wait times and timeouts were due to connection queuing.

Our first approach to fixing this was simply to increase the number of allowed outbound connections to that endpoint:

<system.net>
  <connectionManagement>
    <add address="http://some.endpoint.com" maxconnection="96" />
  </connectionManagement>
</system.net>

This drastically reduced the time our calls to the endpoint took. However, we noticed that it caused our overall inbound requests to take much longer to complete. That's when we came across Microsoft KB 821268. Following the "rule of thumb" guidelines there, we came up with these additional changes:

<processModel maxWorkerThreads="100" maxIoThreads="100" minWorkerThreads="50"/>
<httpRuntime minFreeThreads="704" minLocalRequestFreeThreads="608"/>      
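
For reference, these numbers line up with the KB 821268 rule-of-thumb formulas evaluated for 8 CPUs. The CPU count is an inference from the values (96 / 12 = 704 / 88 = 608 / 76 = 8), not something stated above. A commented sketch of how the pieces fit together follows; the `autoConfig="false"` attribute and the placement of `processModel` in machine.config are assumptions about the deployment, not part of the original settings:

```xml
<configuration>
  <!-- Sketch only: KB 821268 rule-of-thumb values, assuming N = 8 CPUs. -->
  <system.net>
    <connectionManagement>
      <!-- maxconnection = 12 * N = 96 (the KB applies it to address="*";
           here it is scoped to the one endpoint, as in the question) -->
      <add address="http://some.endpoint.com" maxconnection="96" />
    </connectionManagement>
  </system.net>
  <system.web>
    <!-- maxWorkerThreads = 100 and maxIoThreads = 100 (applied per CPU),
         minWorkerThreads = 50. Assumption: set in machine.config with
         autoConfig="false" so the manual values are used. -->
    <processModel autoConfig="false" maxWorkerThreads="100"
                  maxIoThreads="100" minWorkerThreads="50" />
    <!-- minFreeThreads = 88 * N = 704
         minLocalRequestFreeThreads = 76 * N = 608 -->
    <httpRuntime minFreeThreads="704" minLocalRequestFreeThreads="608" />
  </system.web>
</configuration>
```

The 12 * N connection limit and the 88 * N free-thread reserve are meant to move together: the reserve keeps enough threads free for the callbacks of in-flight outbound requests, so raising one without the other can shift the bottleneck rather than remove it.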

This appeared to fix everything. Our calls to some.endpoint.com were still fast, and our response times dropped as well.

A few days later, however, it was brought to our attention that our site was performing poorly, and we saw some SQL Server timeouts. Our DBA did not see anything amiss in the performance of the server, so it looked like something similar was happening all over again; we're wondering whether the increased connections to some.endpoint.com are causing other outbound calls to queue, maybe due to insufficient threads.

The worst part is that we haven't found a good technique to know definitively whether outbound connection queuing is taking place. All we've been able to do is observe the time between making a request and receiving the response in our application. It's hard to know whether timeouts and long response times are due to queuing specifically.

Are there any effective tools for measuring and tuning outbound request throttling? Any other performance tuning tips would definitely be appreciated as well.


Solution

  • The problem you are describing touches many areas of diagnostics, and I suspect there is no single tool that will tell you whether you are suffering from contention. From your description it looks like you are depleting either the connection pool or the thread pool, which usually involves thread locking. Apart from the HttpWebRequest Average Queue Time performance counter pointed out by @Simon Mourier (remember to enable the System.Net counters with <performanceCounters enabled="true" /> under system.net/settings in your config file), there are a few more things to monitor. I would start with custom performance counters that monitor thread pool usage in your ASP.NET application. Unfortunately they are not included in the framework counters, but they are fairly simple to implement, as shown here. Additionally, I wrote a simple PowerShell script that groups the thread states in your application for you. You may get it from here. It resembles the top command in Linux and shows thread states or thread wait reasons for your processes. Have a look at the output for two applications (both named Program.exe):

    one suffering from contention

    > .\ThreadsTop.ps1 -ThreadStates -ProcMask Program
    
    Threads states / process
    
    Process Name    Initialized       Ready     Running     Standby  Terminated     Waiting  Transition     Unknown
    ------------    -----------       -----     -------     -------  ----------     -------  ----------     -------
    Program                   0           0           0           0           0          22           0           0
    

    with the number of waiting threads constantly growing:

    > .\ThreadsTop.ps1 -ThreadWaitReasons -ProcMask Program
    
    Legend:
     0  - Waiting for a component of the Windows NT Executive| 1  - Waiting for a page to be freed
     2  - Waiting for a page to be mapped or copied          | 3  - Waiting for space to be allocated in the paged or nonpag
    ed pool
     4  - Waiting for an Execution Delay to be resolved      | 5  - Suspended
     6  - Waiting for a user request                         | 7  - Waiting for a component of the Windows NT Executive
     8  - Waiting for a page to be freed                     | 9  - Waiting for a page to be mapped or copied
     10 - Waiting for space to be allocated in the paged or nonpaged pool| 11 - Waiting for an Execution Delay to be resolve
    d
     12 - Suspended                                          | 13 - Waiting for a user request
     14 - Waiting for an event pair high                     | 15 - Waiting for an event pair low
     16 - Waiting for an LPC Receive notice                  | 17 - Waiting for an LPC Reply notice
     18 - Waiting for virtual memory to be allocated         | 19 - Waiting for a page to be written to disk
    
    Process Name      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19
    ------------      -   -   -   -   -   -   -   -   -   -  --  --  --  --  --  --  --  --  --  --
    Program           1   0   0   0   0   0  34   0   0   0   0   0   0   0   0   3   0   0   0   0
    

    and the other running normally:

    > .\ThreadsTop.ps1 -ThreadStates -ProcMask Program
    
    Threads states / process
    
    Process Name    Initialized       Ready     Running     Standby  Terminated     Waiting  Transition     Unknown
    ------------    -----------       -----     -------     -------  ----------     -------  ----------     -------
    Program                   0           1           6           0           0          20           0           0
    

    Here the number of waiting threads does not get higher than 24:

    > .\ThreadsTop.ps1 -ThreadWaitReasons -ProcMask Program
    
    Legend:
     0  - Waiting for a component of the Windows NT Executive| 1  - Waiting for a page to be freed
     2  - Waiting for a page to be mapped or copied          | 3  - Waiting for space to be allocated in the paged or nonpag
    ed pool
     4  - Waiting for an Execution Delay to be resolved      | 5  - Suspended
     6  - Waiting for a user request                         | 7  - Waiting for a component of the Windows NT Executive
     8  - Waiting for a page to be freed                     | 9  - Waiting for a page to be mapped or copied
     10 - Waiting for space to be allocated in the paged or nonpaged pool| 11 - Waiting for an Execution Delay to be resolve
    d
     12 - Suspended                                          | 13 - Waiting for a user request
     14 - Waiting for an event pair high                     | 15 - Waiting for an event pair low
     16 - Waiting for an LPC Receive notice                  | 17 - Waiting for an LPC Reply notice
     18 - Waiting for virtual memory to be allocated         | 19 - Waiting for a page to be written to disk
    
    Process Name      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19
    ------------      -   -   -   -   -   -   -   -   -   -  --  --  --  --  --  --  --  --  --  --
    Program           1   0   0   0   0   0  18   0   0   0   0   0   0   0   0   6   0   0   0   0
    

    Of course, the number of threads will be much higher in your case, but you should be able to observe a baseline in thread behavior during "calm times" and peaks in the number of waiting threads when you suffer from contention.

    You may freely modify my script so that it dumps this data somewhere other than the console (a database, for example). Finally, I would recommend running a profiler such as Concurrency Visualizer, which will give you more insight into thread behavior in your application. Enabling the system.net trace sources might also help, although the number of events can be overwhelming, so tune the trace level accordingly.
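
The two configuration-level diagnostics mentioned in this answer (the System.Net performance counters and the system.net trace sources) can be switched on roughly like this. This is a minimal sketch, assuming .NET Framework 4.x; the listener name, the log file name network.log, and the Warning level are illustrative choices, not values from the answer:

```xml
<configuration>
  <system.net>
    <settings>
      <!-- Publishes the System.Net counters, including
           HttpWebRequest Average Queue Time. -->
      <performanceCounters enabled="true" />
    </settings>
  </system.net>
  <system.diagnostics>
    <sources>
      <!-- protocolonly and a small maxdatasize keep the log volume down. -->
      <source name="System.Net" tracemode="protocolonly" maxdatasize="1024">
        <listeners>
          <add name="NetLog" />
        </listeners>
      </source>
    </sources>
    <switches>
      <!-- Start at Warning; raise to Information or Verbose only briefly. -->
      <add name="System.Net" value="Warning" />
    </switches>
    <sharedListeners>
      <!-- Hypothetical log path; the worker process needs write access to it. -->
      <add name="NetLog" type="System.Diagnostics.TextWriterTraceListener"
           initializeData="network.log" />
    </sharedListeners>
    <trace autoflush="true" />
  </system.diagnostics>
</configuration>
```

With the counters enabled, a rising HttpWebRequest Average Queue Time while the remote service reports normal latency points at local connection queuing rather than at the endpoint. On .NET 4 the counters should show up in Performance Monitor under the ".NET CLR Networking 4.0.0.0" category.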