ASP.NET MVC Application crashing randomly


I created an MVC 2 application that works as an RSS feeder and delivers news content for a number of applications to consume. Everything was fine until yesterday, when the application suddenly started throwing a random error that doesn't tell me much about what's going on (or at least I don't understand it). The error only occurs in production and cannot be reproduced in staging or on my local machine.

Here's the stack trace:

Error executing child request for handler 'System.Web.Mvc.HttpHandlerUtil+ServerExecuteHttpHandlerWrapper'.
errorPath:/Android/Edition/2011-11-22/P1 HostIP:##.##.##.##
   at System.Web.HttpServerUtility.ExecuteInternal(IHttpHandler handler, TextWriter writer, Boolean preserveForm, Boolean setPreviousPage, VirtualPath path, VirtualPath filePath, String physPath, Exception error, String queryStringOverride)
   at System.Web.HttpServerUtility.Execute(IHttpHandler handler, TextWriter writer, Boolean preserveForm, Boolean setPreviousPage)
   at System.Web.HttpServerUtility.Execute(IHttpHandler handler, TextWriter writer, Boolean preserveForm)
   at System.Web.HttpServerUtilityWrapper.Execute(IHttpHandler handler, TextWriter writer, Boolean preserveForm)
   at System.Web.Mvc.ViewPage.RenderView(ViewContext viewContext)
   at System.Web.Mvc.ViewResultBase.ExecuteResult(ControllerContext context)
   at System.Web.Mvc.ControllerActionInvoker.<>c_DisplayClass14.b_11()
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionResultFilter(IResultFilter filter, ResultExecutingContext preContext, Func1 continuation)
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionResultFilter(IResultFilter filter, ResultExecutingContext preContext, Func1 continuation)
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionResultWithFilters(ControllerContext controllerContext, IList1 filters, ActionResult actionResult)
   at System.Web.Mvc.ControllerActionInvoker.InvokeAction(ControllerContext controllerContext, String actionName)
   at System.Web.Mvc.Controller.ExecuteCore()
   at System.Web.Mvc.MvcHandler.<>c__DisplayClass8.b__4()
   at System.Web.Mvc.Async.AsyncResultWrapper.<>c__DisplayClass1.b__0()
   at System.Web.Mvc.Async.AsyncResultWrapper.<>c__DisplayClass81.b__7(IAsyncResult _)
   at System.Web.Mvc.Async.AsyncResultWrapper.WrappedAsyncResult`1.End()
   at System.Web.Mvc.MvcHandler.EndProcessRequest(IAsyncResult asyncResult)
   at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

The error is intermittent and can be thrown from any action of any controller, so I have not been able to trace it to a specific action.


Solution

  • Such byzantine errors are hard to debug. The stack trace you're seeing does not help in diagnosing the error.

    First, you should improve error logging. In your Global.asax, implement the Application_Error hook so that it logs the exception and all of its inner exceptions to a file (I wouldn't log this to the database, because the db connection could be the culprit). Make sure that this code is solid: it should focus on these critical errors and not log every 404 page. Also, make sure that the logging code itself cannot cause further problems (it should be highly error tolerant).
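As a rough illustration of the logging advice above, here is a minimal sketch of a file logger that walks the inner-exception chain and never throws. The class name `CrashLog` and its methods are hypothetical names chosen for this example, not part of ASP.NET or of the original answer.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: the name and shape are illustrative assumptions.
public static class CrashLog
{
    // Serializes writes so concurrent requests do not interleave entries.
    private static readonly object FileLock = new object();

    // Flattens an exception and all of its inner exceptions into one text block.
    public static string Describe(Exception ex)
    {
        var sb = new StringBuilder();
        int depth = 0;
        for (Exception e = ex; e != null; e = e.InnerException)
        {
            sb.AppendLine(string.Format("[{0}] {1}: {2}", depth, e.GetType().FullName, e.Message));
            sb.AppendLine(e.StackTrace ?? "(no stack trace)");
            depth++;
        }
        return sb.ToString();
    }

    // Appends the description to a file. Returns false instead of throwing,
    // so a failing logger cannot mask the original error.
    public static bool TryWrite(string path, Exception ex)
    {
        try
        {
            string entry = DateTime.UtcNow.ToString("o") + Environment.NewLine
                         + Describe(ex) + Environment.NewLine;
            lock (FileLock)
            {
                File.AppendAllText(path, entry);
            }
            return true;
        }
        catch (Exception)
        {
            return false;
        }
    }
}
```

From `Application_Error`, one would pass `Server.GetLastError()` to `CrashLog.TryWrite`, skipping any `HttpException` whose `GetHttpCode()` returns 404. The "Error executing child request" exception typically wraps the real failure as an inner exception, which is why logging the whole chain matters here.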

    The cause of this kind of problem is usually unsynchronized access to static variables, which are shared by all requests. Out of all keywords, I believe static is by far the most dangerous one because its effects are so subtle.

    Some common errors I've seen:

    • Caching: Someone wanted to be smart and cache some data in a static dictionary or similar. Unfortunately, the locking code is flawed or missing. An exception occurs only when two requests race: both see that the key is missing, and the second one tries to add an entry that the first has just added:

      // Flawed: the check and the insert are not atomic.
      if (!_dict.ContainsKey(cacheKey))
      {
         // A second thread adds the same key to the dictionary here...
         _dict.Add(cacheKey, cacheData); // ...so Add throws ArgumentException (duplicate key)
      }
      

      This can also happen in a 3rd party library that uses caching or pooling. Access static variables with care.
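One possible way to close the race described above is sketched here, assuming .NET 4 or later (where `ConcurrentDictionary` is available). `NewsCache` and `GetOrLoad` are hypothetical names for this example; on .NET 3.5 a plain `lock` around both the check and the `Add` would serve the same purpose.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache wrapper: the name and shape are illustrative assumptions.
public static class NewsCache
{
    private static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>();

    // GetOrAdd does the lookup and the insert as one thread-safe operation,
    // so two racing requests cannot hit a duplicate-key exception.
    // Caveat: under contention the load delegate may run more than once;
    // only one result is stored and returned to every caller.
    public static string GetOrLoad(string cacheKey, Func<string, string> load)
    {
        return Cache.GetOrAdd(cacheKey, load);
    }

    public static int Count
    {
        get { return Cache.Count; }
    }
}
```

A call such as `NewsCache.GetOrLoad(key, k => LoadEdition(k))` (with `LoadEdition` standing in for whatever produces the data) replaces the `ContainsKey`/`Add` pair. If the load is expensive and must run only once per key, storing `Lazy<T>` values is a common refinement.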

    • Uncommon Code Paths: Something unusual happens that calls code that is not called very often, and that code is flawed. If you have tests with high code coverage, check the spots that are not covered by them.

    • DB Connection Lost: A socket reset on the db connection could lead to exceptions. Many database connection libraries / drivers recover from this quickly (i.e. the next request will be fine). This is hard to handle; basically it shouldn't happen, but there are many reasons why it could occur.