I am working on a system that initially stored data in Azure SQL Server. However, due to limitations in the table design that made it unsuitable for analyzing a promotion activity, we decided to also store the data in a NoSQL database for better querying and analysis.
To ensure consistency, I use a transaction when saving data to Azure SQL Server and Azure Cosmos DB. My understanding is that both _dbContext.SaveChangesAsync
and transaction.CommitAsync
are atomic operations, so I expected all or none of the records to be stored. According to the official documentation, "the transaction will auto-rollback when disposed if either command fails." However, I noticed that:
Additional Information:
redPoints
contains the expected number of elements before insertion.Questions:
SaveChangesAsync
or transaction handling?Here is the simplfied code:
private async Task SaveRedPointsToDb(
List<RedPoint> redPoints,
CancellationToken cancellationToken
)
{
await using var transaction = await _dbContext.Database.BeginTransactionAsync(
cancellationToken
);
try
{
// Azure SQL Server
await _dbContext.RedPoints.AddRangeAsync(redPoints, cancellationToken);
await _dbContext.SaveChangesAsync(cancellationToken);
// Azure Cosmos DB
var redPointDocuments = redPoints.Select(x => new RedPointDocument(x)).toList();
await _redPointCollection.InsertManyAsync(redPointDocuments);
await transaction.CommitAsync(cancellationToken);
}
catch (Exception e)
{
// Log...
}
}
The code looks ok. As written a commit failure on SQL would cause that. In SQL Server commit failures are rare, and typically only happen when you have an IO issue like running out of space on the transaction log, or are using in-memory OLTP which does commit-time conflict resolution. Your logging should have caught that.
Another possible cause is a crash in the client program after writing to Cosmos but before committing in SQL server.
Also the rows may have been inserted, and then deleted in SQL Server.
In general I would recommend abandoning the dual-write pattern, in favor of an eventually-consistent "write-after" pattern. EG write to SQL and include a flag on the row, or a row in a "queue table" indicating that the row needs to be replicated to Cosmos. Then have a background process that copies all such rows and updates the flag or removes the rows from the queue table.