We recently upgraded to EMC AppXtender REST Services 8.1. When we installed it on a server, it created a virtual directory (AppXtenderRest).
We developed our web application by calling the REST services available on this server.
During development the REST server never hung, but once we moved to production it started hanging. We are now resetting IIS on this server every 2-3 hours.
After some research we changed our code to use async/await, but nothing has worked.
We tried to check whether any particular request was making the server hang, but that doesn't appear to be the case. All requests return JSON except for one that returns a Stream (Tiff/Pdf).
Here is a sample of our REST Service call:
using (var client = CreateHttpClient())
{
    using (var response = await client.DeleteAsync(string.Format(RestUrls.deletedoc, DataSource, AppId, docId), GetCancelToken()))
    {
        if (response.IsSuccessStatusCode)
        {
            result = await response.Content.ReadAsStringAsync();
        }
        else
        {
            result = await response.Content.ReadAsStringAsync();
            throw new Exception(result);
        }
    }
}
Also attaching a screenshot of the worker process request queue on the server, which shows requests hanging after a certain period of time (2-3 hours).
Also attaching the debug analysis report taken on the server just after it hung:
https://drive.google.com/open?id=0Bx6jnZk4gj2Ycmw2M1RKM3RiTzg
As we are in production now, we cannot afford frequent IIS resets.
TL;DR - the HttpClient connection-leak fix is good, but your first problem is blocking threads. You have also just exposed sensitive data. And always start with an app pool recycle rather than iisreset, to avoid bringing down the whole server.
As mentioned above, you were leaking TCP connections by wrapping HttpClient in a using block. You have fixed that, so it is not the main issue, although it remains a scaling limit waiting to be hit next.
Besides, if you had exhausted all TCP ports, the symptom would have been more obvious: exceptions, not a hang.
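For reference, here is a minimal sketch of the non-leaking pattern - one shared HttpClient reused for all requests. The class name, base address and timeout below are placeholders, not your actual AppXtender configuration:

    using System;
    using System.Net.Http;
    using System.Threading;
    using System.Threading.Tasks;

    public static class RestClient
    {
        // One HttpClient for the lifetime of the app. Disposing a client per call
        // leaves sockets in TIME_WAIT and can eventually exhaust outbound ports.
        private static readonly HttpClient Client = new HttpClient
        {
            BaseAddress = new Uri("https://your-server/AppXtenderRest/"), // placeholder address
            Timeout = TimeSpan.FromSeconds(100)
        };

        public static async Task<string> DeleteDocumentAsync(string relativeUrl, CancellationToken token)
        {
            // Dispose the response, never the shared client.
            using (var response = await Client.DeleteAsync(relativeUrl, token))
            {
                var body = await response.Content.ReadAsStringAsync();
                if (!response.IsSuccessStatusCode)
                    throw new Exception(body);
                return body;
            }
        }
    }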
Looking at the debugdiag analysis, your problem seems to be sync SQL calls blocking roughly half of the other threads (50.91% per the dump below). If all worker threads end up busy waiting on other blocked threads, incoming requests get queued, producing a hang, until the request queue fills up and IIS returns 503 Service Unavailable.
The following threads in w3wp.exe__AppXtender Rest Services__PID__12056__Date__03_28_2017__Time_09_58_36AM__83__Manual Dump.dmp are waiting to enter a .NET Lock
( 33 34 35 50 52 53 54 56 57 58 59 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 )
50.91% of threads blocked (56 threads)
And the thread they are reportedly waiting on is 55, which is running SqlCommand.ExecuteReader.
There is an async version, ExecuteReaderAsync, which you should switch to (or get the owner of that component to change) - see the sketch after the stack trace below.
Thread 55 - System ID 17820
Entry point clr!Thread::intermediateThreadProc
Create time 3/28/2017 9:51:46 AM
Time spent in user mode 0 Days 00:00:00.421
Time spent in kernel mode 0 Days 00:00:00.187
This thread is waiting on data to be returned from the database server
The current executing command is : SELECT cfgid, cfgvalue FROM ae_cfg WHERE cfgid = 34 and the command timeout is set to 0 seconds.
The connection string for this connection : *** and the connection timeout : 15 seconds.
.NET Call Stack
System_Data_ni!DomainNeutralILStubClass.IL_STUB_PInvoke(SNI_ConnWrapper*, SNI_Packet**, Int32)+84
[[InlinedCallFrame] (.SNIReadSyncOverAsync)] .SNIReadSyncOverAsync(SNI_ConnWrapper*, SNI_Packet**, Int32)
System_Data_ni!SNINativeMethodWrapper.SNIReadSyncOverAsync(System.Runtime.InteropServices.SafeHandle, IntPtr ByRef, Int32)+6a
System_Data_ni!System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()+83
System_Data_ni!System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()+7e
System_Data_ni!System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()+65
System_Data_ni!System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte ByRef)+2e
System_Data_ni!System.Data.SqlClient.TdsParser.TryRun(System.Data.SqlClient.RunBehavior, System.Data.SqlClient.SqlCommand, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.BulkCopySimpleResultSet, System.Data.SqlClient.TdsParserStateObject, Boolean ByRef)+292
System_Data_ni!System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()+5c
System_Data_ni!System.Data.SqlClient.SqlDataReader.get_MetaData()+66
System_Data_ni!System.Data.SqlClient.SqlCommand.FinishExecuteReader(System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.RunBehavior, System.String)+11d
System_Data_ni!System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(System.Data.CommandBehavior, System.Data.SqlClient.RunBehavior, Boolean, Boolean, Int32, System.Threading.Tasks.Task ByRef, Boolean, System.Data.SqlClient.SqlDataReader, Boolean)+ba0
System_Data_ni!System.Data.SqlClient.SqlCommand.RunExecuteReader(System.Data.CommandBehavior, System.Data.SqlClient.RunBehavior, Boolean, System.String, System.Threading.Tasks.TaskCompletionSource`1, Int32, System.Threading.Tasks.Task ByRef, Boolean)+22a
System_Data_ni!System.Data.SqlClient.SqlCommand.RunExecuteReader(System.Data.CommandBehavior, System.Data.SqlClient.RunBehavior, Boolean, System.String)+62
System_Data_ni!System.Data.SqlClient.SqlCommand.ExecuteReader(System.Data.CommandBehavior, System.String)+ca
XtenderSolutions.UtilityLibrary.General.DbCommon.GetStringTypeFromDB(XtenderSolutions.Administration.Database.DbCommonEx)+1aa
XtenderSolutions.UtilityLibrary.General.DbCommon.Open()+11c
XtenderSolutions.CMData.CMConnection.Open()+a7
XtenderSolutions.CMData.CMCfgMgr.Load(XtenderSolutions.CMData.CMConnection, Int16)+55
XtenderSolutions.CMData.CMConnection.InitEAIHooks()+4f
XtenderSolutions.CMData.CMConnection.Init(System.String)+595
XtenderSolutions.CMData.CMConnection..ctor(XtenderSolutions.CMData.CMSession, System.String)+17b
XtenderSolutions.CMData.CMSession.get_Connection()+7e
XtenderSolutions.CMData.CMSession.Login(XtenderSolutions.Configuration.DataSourceConfig, System.String, System.String, System.Security.Principal.WindowsIdentity, System.String, Boolean)+46e
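For illustration, a minimal sketch of what the async pattern looks like. The query mirrors the ae_cfg lookup from the dump, but the real code lives in EMC's XtenderSolutions assemblies, so this is what the component owner would need to do, not something you can patch yourself:

    using System.Data.SqlClient;
    using System.Threading.Tasks;

    public static async Task<string> GetCfgValueAsync(string connectionString, short cfgId)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT cfgvalue FROM ae_cfg WHERE cfgid = @cfgid", connection))
        {
            command.Parameters.AddWithValue("@cfgid", cfgId);
            await connection.OpenAsync();

            // ExecuteReaderAsync hands the thread back to the pool while waiting
            // on SQL Server; ExecuteReader keeps the worker thread blocked.
            using (var reader = await command.ExecuteReaderAsync())
            {
                return await reader.ReadAsync() ? reader.GetString(0) : null;
            }
        }
    }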
Also, I would strongly recommend removing your debugdiag share, or at least scrubbing the sensitive data from it before sharing, and changing the account password.
Hint: Basic Auth headers -> base64 -> cleartext user:pwd
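To illustrate (with a made-up credential, not anything taken from your dump), Basic authentication is just reversible Base64, not encryption:

    using System;
    using System.Text;

    // An "Authorization: Basic ..." header decodes straight back to user:password.
    var header = Convert.ToBase64String(Encoding.UTF8.GetBytes("someuser:S3cret!"));
    Console.WriteLine(header);  // c29tZXVzZXI6UzNjcmV0IQ==
    Console.WriteLine(Encoding.UTF8.GetString(Convert.FromBase64String(header)));  // someuser:S3cret!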
Lastly, IISReset:
If you are not yet at the point where the http.sys request queue has filled up, you can try an app pool recycle, which gives you a fresh w3wp.exe worker process, or even a pool stop/start, since you really don't want to wait for the current requests to keep hanging. A pool recycle is less intrusive than bringing the whole IIS server down. But once many requests are stuck in the http.sys queue you may end up needing iisreset anyway. I would always try to avoid iisreset, especially if there are other sites/vdirs on that host. You can monitor the IIS performance counters and decide based on those.
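If you want to script the recycle rather than click through IIS Manager, a minimal sketch using Microsoft.Web.Administration (run elevated; the pool name is a placeholder for whichever pool hosts the AppXtenderRest vdir):

    using Microsoft.Web.Administration; // reference Microsoft.Web.Administration.dll from %windir%\System32\inetsrv

    using (var serverManager = new ServerManager())
    {
        // Recycle only the affected pool; other sites on the box keep running.
        var pool = serverManager.ApplicationPools["AppXtenderRestPool"]; // placeholder name
        pool?.Recycle();
    }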