Search code examples
c#sql-servernservicebusmsmqmsdtc

Non-simultaneous commits in a distributed transaction involving EntityFramework/SQL Server and NServiceBus/MSMQ


There is a .NET 4.7 WebAPI application working with SQL Server using Entity Framework and hosting NServiceBus endpoint with MSMQ transport.

Simplified workflow can be described by a controller action:

[HttpPost]
public async Task<IHttpActionResult> SendDebugCommand()
{
    var sample = new Sample
                 {
                     State = SampleState.Initial,
                 };
    _dataContext.Set<Sample>().Add(sample);
    await _dataContext.SaveChangesAsync();

    sample.State = SampleState.Queueing;

    var options = new TransactionOptions
                  {
                      IsolationLevel = IsolationLevel.ReadCommitted,
                  };
    using (var scope = new TransactionScope(TransactionScopeOption.Required, options, TransactionScopeAsyncFlowOption.Enabled))
    {
        await _dataContext.SaveChangesAsync();    
        await _messageSession.Send(new DebugCommand {SampleId = sample.Id});
        scope.Complete();
    }

    _logger.OnCreated(sample);

    return Ok();
}

And DebugCommand handler, that is sent to the same NServiceBus endpoint:

public async Task Handle(DebugCommand message, IMessageHandlerContext context)
{
    var sample = await _dataContext.Set<Sample>().FindAsync(message.SampleId);

    if (sample == null)
    {
        _logger.OnNotFound(message.SampleId);
        return;
    }

    if (sample.State != SampleState.Queueing)
    {
        _logger.OnUnexpectedState(sample, SampleState.Queueing);
        return;
    }

    // Some work being done

    sample.State = SampleState.Processed;
    await _dataContext.SaveChangesAsync();

    _logger.OnHandled(sample);
}

Sometimes, message handler retrieves the Sample from the DB and its state is still Initial, not Queueing as expected. That means that distributed transaction initiated in the controller action is not yet fully complete. That is also confirmed by time-stamps in the log file.

The 'sometimes' happens quite rarely, under heavier load and network latency probably affects. Couldn't reproduce the problem with local DB, but easily with a remote DB.

I checked DTC configurations. I checked there is escalation to a distributed transaction for sure. Also if scope.Complete() is not called then there will be no DB update neither message sending happening.

When the transaction scope is completed and disposed, intuitively I expect both DB and MSMQ to be settled before a single further instruction is executed.

I couldn't find definite answers to questions:

  • Is this the way DTC work? Is this normal for both transaction parties to do commits, while completion is not reported back to the coordinator?
  • If yes, does it mean I should overcome such events by altering logic of the program?
  • Am I misusing transactions somehow? What would be the right way?

Solution

  • In addition to the comments mentioned by Evk in Distributed transaction with MSMQ and SQL Server but sometimes getting dirty reads here's also an excerpt from the particular documentation page about transactions:

    A distributed transaction between the queueing system and the persistent storage guarantees atomic commits but guarantees only eventual consistency.

    Two additional notes:

    • NServiceBus uses IsolationLevel.ReadCommitted by default for the transaction used to consume messages. This can be configured although I'm not sure whether setting it to serialized on the consumer would really solve the issue here.
    • In general, it's not advised to use a shared database between services as this highly increases coupling and opens the door for issues like you're experiencing here. Try to pass relevant data as part of the message and keep the database an internal storage for one service. Especially when using web servers, a common pattern is to add all the relevant data to a message and fire it while confirming success to the user (as the message won't be lost) while the receiving endpoint can store the data to it's database if necessary. To give more specific recommendations, this requires more knowledge about your domain and use case. I can recommend the particular discussion community to discuss design/architectural question like this.