Search code examples
azure-application-insights

Why is the timeout error untraceable on App Insights?


We have a request to gather insights as follows for timeouts from asp.net web applications hosted on Azure App Service

  • Date
  • Time (by at least hour)
  • API Endpoint

Ideally, this would also include a way to distinguish by:

  • Default API endpoint
  • API endpoint using filter

Basically the ultimate goal is to track down the root cause of numerous timeout alerts we keep getting.

Basically, we are able to identify some of the issues/alerts we get and trace them on the app insights, but for some reason, we are unable to trace timeouts on the app insights

Traceable alert:

alert1

Non-traceable alert:

alert2

I've considered Kusto queries, but I'm not sure what the formula would be to create metrics discussed above.

This is the full untraceable error for reference:

System.ComponentModel.Win32Exception: The wait operation timed out

Generated: Fri, 19 Nov 2021 13:30:41 GMT

System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
at System.Data.SqlClient.SqlDataReader.get_MetaData()
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
at Ideal.Infrastructure.Data.Dapper.SqlMapper.<QueryInternal>d__34`1.MoveNext() in C:\Projects\edgemvc-ideal\Ideal.Infrastructure.Data\Dapper\SqlMapper.cs:line 697
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
at Ideal.Reporting.DynamicDataService.GetDataWithFilterSingular(String procName, BaseFilterParams filterParams) in C:\Projects\edgemvc-ideal\Ideal.ReportService\DynamicDataService.cs:line 28
at Ideal.Reporting.ReportingService.PracticeBreakoutReportPro(String section, BaseFilterParams filters, String schema) in C:\Projects\edgemvc-ideal\Ideal.ReportService\ReportingService.cs:line 248
at Ideal.API.Controllers.ReportController.PracticeBreakoutSectionPro(FilterParams filters, String descriptionId, String section) in C:\Projects\edgemvc-ideal\Ideal.API\Controllers\ReportController\ReportController.cs:line 81
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.<>c__DisplayClass10.<GetExecutor>b__9(Object instance, Object[] methodParameters)
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ExecuteAsync(HttpControllerContext controllerContext, IDictionary`2 arguments, CancellationToken cancellationToken)
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.ApiControllerActionInvoker.<InvokeActionAsyncCore>d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.ActionFilterResult.<ExecuteAsync>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.ExceptionFilterResult.<ExecuteAsync>d__0.MoveNext()
ClientConnectionId:9963f70a-f829-4b28-9655-4e256156f4bd
Error Number:-2,State:0,Class:11

Edit

I have tried looking for the exception in Log Analytics per suggestion, but it's still not traceable:

LA


Solution

  • I'm also unable to find some of these timeouts in the Performance/Failure tabs under Application Insights for my particular App Service.

    However, after I implemented Application Insights SDK (Javascript) on my frontend:

    • I think I'm able to find them for each Page in my application, using Log Analytics:

    • pageViews
      | where duration > 25000 and client_Type == "Browser"
      | sort by timestamp, count_sum
      

      where 25000 equals 25 seconds, a threshold for our timeouts.

    • It seems that backend requests are traced automatically in the requests table, so no extra implementation on frontend required:

      requests
      | where duration > 25000
      | summarize count_sum = count() by bin(timestamp,1d), operation_Name, name
      | sort by timestamp, count_sum
      

    Disclaimer: I'm not 100% sure about the correctness, but at least it has helped me a bit. The documentation on Application Insights is a bit complex I feel.