
MSMQ + WCF - Immediately Move Messages to the Dead-Letter Queue


We have a WCF service that listens for messages on a queue (MSMQ). It sends a request to our web server (REST API), which returns an HTTP status code.

If the status code falls in the 400 range, we throw the message away. The idea is that a 400-range error can never succeed (unauthorized, bad request, not found, etc.), so we don't want to keep retrying.

For all other errors (e.g., 500 - Internal Server Error), we have WCF configured to put the message on a "retry" queue. Messages on the retry queue get retried after a certain amount of time. The idea is that the server is temporarily down, so wait and try again.
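For reference, a retry setup of this kind can be expressed on the binding itself. This is only a sketch: the class name `RetryBindingFactory` and every numeric value are assumptions chosen for illustration, not our actual settings, and it assumes a .NET Framework project that references System.ServiceModel.

```csharp
using System;
using System.ServiceModel;

public static class RetryBindingFactory
{
    // Builds a NetMsmqBinding with illustrative retry settings (all values assumed).
    public static NetMsmqBinding Create()
    {
        return new NetMsmqBinding
        {
            // Immediate redelivery attempts before the message is parked for a retry cycle.
            ReceiveRetryCount = 2,
            // Number of delayed retry cycles before the message is treated as poison.
            MaxRetryCycles = 3,
            // How long a message waits between retry cycles.
            RetryCycleDelay = TimeSpan.FromMinutes(5),
            // Move exhausted messages aside instead of faulting the listener.
            ReceiveErrorHandling = ReceiveErrorHandling.Move
        };
    }
}
```

The same properties can also be set as attributes on the binding element in the configuration file.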

The way WCF is set up, if we throw a FaultException in the service contract, it will automatically put the message on the retry queue.

When a message causes a 400 range error, we are just swallowing the error (we just log it). This prevents the retry mechanism from firing; however, it would be better to move the message to a dead-letter queue. This way we can react to the error by sending an email to the user and/or a system administrator.
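To make the current behaviour concrete, here is a rough sketch of the decision the handler makes. `MessageOutcome` and `StatusClassifier` are hypothetical names for illustration, and treating every non-2xx, non-4xx code as retryable is an assumption based on "all other errors" above.

```csharp
using System;

public enum MessageOutcome
{
    Done,    // 2xx: the request succeeded
    Discard, // 4xx: retrying can never succeed, so log and swallow
    Retry    // anything else: throw so WCF puts the message on the retry queue
}

public static class StatusClassifier
{
    // Maps an HTTP status code to what the queue handler should do with the message.
    public static MessageOutcome Classify(int statusCode)
    {
        if (statusCode >= 200 && statusCode < 300)
        {
            return MessageOutcome.Done;
        }
        if (statusCode >= 400 && statusCode < 500)
        {
            return MessageOutcome.Discard;
        }
        return MessageOutcome.Retry;
    }
}
```

In the service operation, `Retry` corresponds to throwing the FaultException, and `Discard` is the branch we would like to route to a dead-letter queue instead of just logging.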

Is there a way to immediately move these bad messages to a dead-letter queue?


Solution

  • First, I kept referring to the dead-letter queue. When I posted this question, I was unaware that WCF/MSMQ automatically creates what's known as a poison sub-queue. Any message that can't be processed within the configured number of attempts is moved to the poison sub-queue.

    In my situation, I knew that some messages would never succeed, so I wanted to move the message out of the queue immediately.

    The solution was to create a second queue that I called "poison" (not to be confused with the poison sub-queue). My catch block would create an instance of a WCF client and forward the message to this poison queue. I could reuse the same client to post to both the original queue and the poison queue; I just had to create a separate client end-point in the configuration file for each.

    I had two separate ServiceHost instances running that read the queues. The ServiceHost for the original queue did the HTTP request and forwarded messages to the poison queue when unrecoverable errors occurred. The second ServiceHost would simply send out an email to record that a message was lost.

    There was also the issue of temporary errors that exceeded the maximum number of tries. WCF/MSMQ automatically creates a sub-queue called <myqueuename>;poison. You cannot directly write to a sub-queue via WCF, but you can read from it using a ServiceHost. Whenever messages end up in the poison sub-queue, I simply forward the message to the poison queue, with the exact same client I use in the original handler's catch block.

    I wanted the ability to include a stack trace in the error emails. Since I was reusing the same client and service contract for all of the handlers, I couldn't just pass along the stack trace as a string (unless I added it to all of my data contracts). Instead, I had the poison handler try to execute the code one more time, which would fail again and spit out the stack trace.

    This is what my message queues ended up looking like:

    MyQueue
        - Queue messages
        - Retry
        - Poison
    MyQueuePoison
        - Queue messages
    

    This approach is pretty convoluted. It was strange calling a WCF client from within a WCF service handler. It also meant setting up one more queue on the server and a ton of additional configuration sections for specifying which queue a client should forward messages to.
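The routing described above can be sketched roughly as follows. This is not the original code: `IQueueClient`, `InMemoryQueueClient`, `UnrecoverableException`, `MainQueueHandler` and `PoisonSubQueueHandler` are hypothetical names, and an in-memory list stands in for the WCF client so the sketch runs without MSMQ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the WCF client that posts to a queue endpoint.
public interface IQueueClient
{
    void Send(string message);
}

// In-memory fake so the sketch can run without MSMQ.
public sealed class InMemoryQueueClient : IQueueClient
{
    public readonly List<string> Messages = new List<string>();

    public void Send(string message)
    {
        Messages.Add(message);
    }
}

// Hypothetical marker for failures that can never succeed (for example an HTTP 4xx).
public sealed class UnrecoverableException : Exception
{
    public UnrecoverableException(string message) : base(message) { }
}

// Handler for the original queue.
public sealed class MainQueueHandler
{
    private readonly IQueueClient _poisonQueue;
    private readonly Action<string> _process;

    public MainQueueHandler(IQueueClient poisonQueue, Action<string> process)
    {
        _poisonQueue = poisonQueue;
        _process = process;
    }

    public void Handle(string message)
    {
        try
        {
            _process(message);
        }
        catch (UnrecoverableException)
        {
            // Forward instead of rethrowing, so the message leaves the
            // original queue without going through the retry cycle.
            _poisonQueue.Send(message);
        }
        // Any other exception propagates; in the real service that is what
        // lets WCF move the message to the retry sub-queue.
    }
}

// Handler for the ";poison" sub-queue: retries are exhausted, so just forward.
public sealed class PoisonSubQueueHandler
{
    private readonly IQueueClient _poisonQueue;

    public PoisonSubQueueHandler(IQueueClient poisonQueue)
    {
        _poisonQueue = poisonQueue;
    }

    public void Handle(string message)
    {
        _poisonQueue.Send(message);
    }
}
```

In the real service, each handler is a service contract implementation hosted by its own ServiceHost, and the forwarding client is a WCF client bound to the endpoint for the second queue.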