Handling service bus Message.Complete() exceptions


Consider this scenario: an Azure Service Bus namespace with message deduplication enabled, a single topic with a single subscription, and an application that receives messages from that subscription.

How can I ensure that the application receives each message from the subscription once and only once?

Here is the code I'm using in my application to receive messages:

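using System;
using Microsoft.ServiceBus.Messaging;

public abstract class ServiceBusListener<T> : IServiceBusListener
{
    private SubscriptionClient subscriptionClient;
    // ..... snip

    private void ReceiveMessages()
    {
        // Receive returns null if no message arrives within the 5-second timeout.
        BrokeredMessage message = this.subscriptionClient.Receive(TimeSpan.FromSeconds(5));

        if (message != null)
        {
            // GetBody<T>() takes no message argument; it deserializes this message's body.
            T payload = message.GetBody<T>();

            try
            {
                DoWork(payload);

                message.Complete();
            }
            catch (Exception exception)
            {
                // message.Complete failed
                // (note: this handler also catches exceptions thrown by DoWork)
            }
        }
    }
}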

The problem I foresee is that if message.Complete() fails for whatever reason, the message that has just been processed will remain on the subscription's queue in Azure. When ReceiveMessages() is called again, it will pick up that same message and the application will do the same work again.

Whilst the best solution would be to have idempotent domain logic (DoWork(payload)), this would be very difficult to write in this instance.

The only method I can see to ensure once and only once delivery to an application is by building another queue to act as an intermediary between the Azure service bus and the application. I believe this is called a 'Durable client-side queue'.

However, I can see that this would be a potential issue for many applications that use Azure Service Bus, so is a durable client-side queue the only solution?


Solution

  • I have similar challenges in a large-scale Azure platform I am responsible for. I use a combination of the concepts embodied by the Compensating Transaction pattern (https://msdn.microsoft.com/en-us/library/dn589804.aspx) and the Event Sourcing pattern (https://msdn.microsoft.com/en-us/library/dn589792.aspx). Exactly how you incorporate these concepts will vary, but ultimately you may need to plan your own "rollback" logic, or a way of detecting that a previous run completed 100% successfully except for the removal of the message. If there is something you can check up front that tells you a message was processed but simply not removed, complete it and move on. How expensive that check is may make this a bad idea. You can even create an artificial final step, like adding a row to a DB, that runs only when DoWork reaches the end. You can then check for that row before processing any other messages.

    IMO, the best approach is to make each step in your DoWork() check whether its work has already been performed (if possible). For example, if a step creates a DB table, run an "IF NOT EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA..." check first. That way, even in the unlikely event that a message is delivered twice, it is safe to process it again.

    Another approach I use is to store the identifiers of the previous X messages (e.g. 10,000), such as the SequenceNumber (the sequential 64-bit integer that Service Bus assigns to each message), and then check for their existence (NOT IN) before I proceed with processing a message. This is not as expensive as you might think, and it is very safe. If the identifier is found, simply Complete() the message and move on. In other situations, I update the message with a "starting" type status (inline in certain queue types, persisted elsewhere in others), then proceed. If you read a message and its status is already set to "started", you know something either failed or did not clear appropriately.

    Sorry this is not a clear cut answer, but there are a lot of considerations.

    Kindest regards...
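
The "store the identifiers of the previous X messages" idea from the answer could be sketched roughly as follows. This is only an illustration under stated assumptions: RecentMessageTracker and its members are hypothetical names, the window is kept in memory (the answer appears to describe a database check, which would survive restarts), and the identifier is assumed to be a long such as BrokeredMessage.SequenceNumber.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of any Service Bus SDK): remembers the
// identifiers of the most recent `capacity` processed messages in memory.
public sealed class RecentMessageTracker
{
    private readonly int capacity;
    private readonly HashSet<long> seen = new HashSet<long>();
    private readonly Queue<long> order = new Queue<long>();

    public RecentMessageTracker(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
    }

    // True if the identifier was recorded and has not yet been evicted.
    public bool WasProcessed(long sequenceNumber)
    {
        return seen.Contains(sequenceNumber);
    }

    // Records an identifier, evicting the oldest one once the window is full.
    public void MarkProcessed(long sequenceNumber)
    {
        if (!seen.Add(sequenceNumber)) return; // already recorded
        order.Enqueue(sequenceNumber);
        if (order.Count > capacity)
        {
            seen.Remove(order.Dequeue());
        }
    }
}

// Assumed usage inside ReceiveMessages(), with `tracker` as a field:
//
//     if (tracker.WasProcessed(message.SequenceNumber))
//     {
//         message.Complete();   // work was done earlier; only removal failed
//         return;
//     }
//     DoWork(payload);
//     tracker.MarkProcessed(message.SequenceNumber);
//     message.Complete();
```

Note that an in-memory window only covers the case where Complete() fails while the process keeps running. To cover crashes between DoWork and Complete(), the identifier would have to be persisted, ideally in the same transaction as the work itself.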