c# asp.net asp.net-mvc elmah elmah.mvc

ELMAH exceptions generating generic "The service is unavailable" message


I'm trying to create an availability page which checks all the services that a site uses, wrapping each check in a try/catch and then displaying any failures to the users. One of those services is ELMAH, so I am calling that to double check that we can log errors there successfully.

Controller:

var a = new AvailabilityModel();

try {
    a.ElmahConnectionString = ConfigurationManager.ConnectionStrings["elmah-sqlserver"].ConnectionString;
    Elmah.ErrorSignal.FromCurrentContext().Raise(new Exception("Elmah availability test"));
    a.ElmahSuccess = true;

} catch (Exception ex) {
    a.ElmahSuccess = false;
    a.ElmahException = ex;
    Response.StatusCode = 503;
}

return View(a);

When ELMAH succeeds, all is well. When it fails for any reason (DB permissions, etc.), I get an error that is not captured by the try/catch or by any of the usual error-handling layers: ASP.NET MVC HandleError, customErrors redirects, or even httpErrors in system.webServer. The response is not the usual generic IIS error page; instead, I see a single line saying "The service is unavailable."

Response:

LTSB-W34511 C:\s\d\build % curl -i http://server/test/availability
HTTP/1.1 503 Service Unavailable
Cache-Control: public, max-age=14400, s-maxage=0
Content-Type: text/html
Server: Microsoft-IIS/7.5
X-AspNetMvc-Version: 4.0
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Wed, 06 Aug 2014 15:46:55 GMT
Content-Length: 27

The service is unavailable.

And that's it. At least I know something is down, but I also want to show the user that ELMAH is causing the problem, along with the connection string it's trying to use. So I need to capture this exception somehow.

I've tried tweaking my web.config a number of different ways, but I suspect there's something about the way ELMAH inserts itself into the module pipeline which stops me from handling the issue.

Edit:

To clarify, this is a simplified example. I am not planning to expose this information to end users. This availability page will only be available to internal users who are troubleshooting future issues.

ELMAH is only one of the services/databases used by the application in question, and I want to give administrators a quick dashboard-like view of what is up and down. I can't do that if ELMAH errors lead to this insta-503.


Solution

  • OK, basically this is not possible without writing some code. ELMAH's Raise method swallows any exception thrown while logging, so you will not see the error unless you listen to the trace output:

    // ErrorLogModule.LogException
    try
    {
        Error error = new Error(e, context);
        ErrorLog errorLog = this.GetErrorLog(context);
        error.ApplicationName = errorLog.ApplicationName;
        string id = errorLog.Log(error);
        errorLogEntry = new ErrorLogEntry(errorLog, id, error);
    }
    catch (Exception value)
    {
        Trace.WriteLine(value);
    }
    

    However, when an error is logged successfully, the ErrorLogModule raises its Logged event to let any listeners know that logging succeeded. So let's quickly write a custom class that overrides some methods of ErrorLogModule and lets us notice when an error was not logged:

    using System;
    using System.Web;

    public class CustomErrorLogModule : Elmah.ErrorLogModule
    {
        public Boolean SomethingWasLogged { get; set; }
        protected override void OnLogged(Elmah.ErrorLoggedEventArgs args)
        {
            SomethingWasLogged = true;
            base.OnLogged(args);
        }
    
        protected override void LogException(Exception e, HttpContext context)
        {
            SomethingWasLogged = false;
            base.LogException(e, context);
            if (!SomethingWasLogged)
            {
                throw new InvalidOperationException("An error was not logged");
            }
        }
    }
    

    Swap the ErrorLogModule for the CustomErrorLogModule in your configuration file and ELMAH will complain when logging fails: calling Elmah.ErrorSignal.FromCurrentContext().Raise(new Exception("test")); in a test page now throws the InvalidOperationException("An error was not logged") out of the call, where your try/catch can handle it.


    If you want to get back the exact exception that occurred while trying to log the error, you can use the fact that the ErrorLogModule traces that exception when it occurs. Create a listener class:

    using System;
    using System.Diagnostics;

    public class ExceptionInterceptor : DefaultTraceListener
    {
        public Exception TracedException { get; set; }
        public override void WriteLine(object o)
        {
            var exception = o as Exception;
            if (exception != null)
            {
                TracedException = exception;
            }
        }
    }
    

    Then your LogException method becomes:

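    protected override void LogException(Exception e, HttpContext context)
    {
        var exceptionListener = new ExceptionInterceptor();
        Trace.Listeners.Add(exceptionListener);
        try
        {
            SomethingWasLogged = false;
            base.LogException(e, context);
            if (!SomethingWasLogged)
            {
                // Wrap the traced exception instead of rethrowing it, so its original
                // stack trace survives as InnerException. TracedException can be null
                // if nothing was traced, and "throw null" would itself fail.
                throw new InvalidOperationException("An error was not logged", exceptionListener.TracedException);
            }
        }
        finally
        {
            // Always unregister, even when the check above throws.
            Trace.Listeners.Remove(exceptionListener);
        }
    }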
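
    To see the interception mechanism in isolation, here is a minimal sketch that needs neither ELMAH nor ASP.NET. LogQuietly and TraceCaptureDemo are hypothetical names invented for this illustration; LogQuietly only imitates the swallow-and-trace pattern of ErrorLogModule.LogException shown earlier. The #define line is an assumption about how you compile the file: Trace.WriteLine calls are removed at compile time when the TRACE symbol is not defined.

    ```csharp
    #define TRACE // Trace.WriteLine calls are compiled out unless TRACE is defined

    using System;
    using System.Diagnostics;

    // Same listener as above: remembers the last exception passed to Trace.WriteLine(object).
    public class ExceptionInterceptor : DefaultTraceListener
    {
        public Exception TracedException { get; private set; }

        public override void WriteLine(object o)
        {
            var exception = o as Exception;
            if (exception != null)
            {
                TracedException = exception;
            }
        }
    }

    public static class TraceCaptureDemo
    {
        // Hypothetical stand-in for ErrorLogModule.LogException:
        // the failure is swallowed and only written to the trace.
        static void LogQuietly()
        {
            try
            {
                throw new InvalidOperationException("simulated logging failure");
            }
            catch (Exception ex)
            {
                Trace.WriteLine(ex);
            }
        }

        // Returns the exception that LogQuietly traced, or null if nothing was traced.
        public static Exception CaptureTracedException()
        {
            var listener = new ExceptionInterceptor();
            Trace.Listeners.Add(listener);
            try
            {
                LogQuietly();
            }
            finally
            {
                // Unregister so the listener does not outlive the check.
                Trace.Listeners.Remove(listener);
            }
            return listener.TracedException;
        }
    }
    ```

    Calling TraceCaptureDemo.CaptureTracedException() should hand back the InvalidOperationException that LogQuietly swallowed. Keep in mind that Trace.Listeners is shared by the whole process, so while the listener is registered it can also see exceptions traced by other threads.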
    

    EDIT: Or, if you want to be as terse as possible:

    public class ExceptionInterceptor : DefaultTraceListener
    {
        public override void WriteLine(object o)
        {
            var exception = o as Exception;
            if (exception != null)
            {
                throw exception;
            }
        }
    }
    
    // snip... LogException in your CustomErrorLogModule
    protected override void LogException(Exception e, HttpContext context)
    {
        var exceptionListener = new ExceptionInterceptor();
        Trace.Listeners.Add(exceptionListener);
        try
        {
            base.LogException(e, context);
        }
        finally
        {
            Trace.Listeners.Remove(exceptionListener);
        }
    }
    

    One final word: there is a smell in this way of checking for service availability, and you are going to pepper your error database with test exceptions, which may not be what you want. I understand that you aim to check the whole logging chain, but there may be another way to do it. I don't really know your context, so I won't comment any further, but it is worth thinking about.

    Anyway, these changes should let you receive the exception you need.


    Important edit: you may want to add a trigger to your CustomErrorLogModule so it doesn't throw when you are not testing. The resilience you are observing in ELMAH is generally a good thing, because you don't want a diagnostic platform to cause problems that may themselves need diagnosing. That's why ELMAH and other logging frameworks don't throw, and that's why you should make the rethrowing mechanism triggerable, so your program doesn't have to watch its step when raising exceptions in ELMAH.
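
    One possible shape for such a trigger, as a rough sketch rather than a definitive implementation: the module only throws when the current request has opted in through a flag in HttpContext.Items. StrictModeKey and EnableStrictMode are hypothetical names invented here, not part of ELMAH, and the code assumes the project references ELMAH and System.Web.

    ```csharp
    using System;
    using System.Web;

    public class CustomErrorLogModule : Elmah.ErrorLogModule
    {
        // Hypothetical per-request flag; any unique object works as an Items key.
        private static readonly object StrictModeKey = new object();

        private bool _somethingWasLogged;

        // Hypothetical helper: call it before raising the test exception.
        public static void EnableStrictMode(HttpContext context)
        {
            context.Items[StrictModeKey] = true;
        }

        protected override void OnLogged(Elmah.ErrorLoggedEventArgs args)
        {
            _somethingWasLogged = true;
            base.OnLogged(args);
        }

        protected override void LogException(Exception e, HttpContext context)
        {
            _somethingWasLogged = false;
            base.LogException(e, context);

            // Only requests that opted in get an exception; all other requests
            // keep ELMAH's normal fail-silent behavior.
            var current = context ?? HttpContext.Current;
            bool strict = current != null && current.Items.Contains(StrictModeKey);
            if (strict && !_somethingWasLogged)
            {
                throw new InvalidOperationException("An error was not logged");
            }
        }
    }

    // In the availability action, inside the existing try block:
    //     CustomErrorLogModule.EnableStrictMode(System.Web.HttpContext.Current);
    //     Elmah.ErrorSignal.FromCurrentContext().Raise(new Exception("Elmah availability test"));
    ```

    Keeping the flag on the request means that ordinary pages raising errors through ELMAH are unaffected, and only the availability page sees the failure. Note that an error dismissed by an ELMAH filter would also leave the flag unset, so in strict mode a filtered test exception would be reported as "not logged" too.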