Search code examples
c#sql-servermongodbentity-framework-coretransactions

Cosmos DB Inserts All Data, but SQL Server Only Inserts Some Records Within a Transaction


I am working on a system that initially stored data in Azure SQL Server. However, due to limitations in the table design that made it unsuitable for analyzing a promotion activity, we decided to also store the data in a NoSQL database for better querying and analysis.

To ensure consistency, I use a transaction when saving data to Azure SQL Server and Azure Cosmos DB. My understanding is that both _dbContext.SaveChangesAsync and transaction.CommitAsync are atomic operations, so I expected all or none of the records to be stored. According to the official documentation, "the transaction will auto-rollback when disposed if either command fails." However, I noticed that:

  • All data is successfully inserted into Azure Cosmos DB.
  • Only some records appear in Azure SQL Server.

Additional Information:

  • Expected Behavior:
    • Both SQL Server and Cosmos DB should store the same number of records.
  • Things I Have Checked:
    • redPoints contains the expected number of elements before insertion.
    • No unique constraints, triggers, or other SQL Server table restrictions should be causing silent rejections.
    • We recently changed our logging system, so we lost past log data, making debugging more difficult.
    • We have not enabled the SQL retry strategy.
  • Tools:
    • NET: 6.0
    • Microsoft.EntityFrameworkCore: 7.0.7
    • MongoDB.Drive: 2.22.0

Questions:

  • What could cause SQL Server to insert only some of the records while Cosmos DB successfully inserts all of them?
  • Is there a potential issue with SaveChangesAsync or transaction handling?
  • How can I debug this issue further?

Here is the simplfied code:

private async Task SaveRedPointsToDb(
    List<RedPoint> redPoints,
    CancellationToken cancellationToken
)
{
    await using var transaction = await _dbContext.Database.BeginTransactionAsync(
        cancellationToken
    );
    try
    {
        // Azure SQL Server
        await _dbContext.RedPoints.AddRangeAsync(redPoints, cancellationToken);
        await _dbContext.SaveChangesAsync(cancellationToken);

        // Azure Cosmos DB
        var redPointDocuments = redPoints.Select(x => new RedPointDocument(x)).toList();
        await _redPointCollection.InsertManyAsync(redPointDocuments);

        await transaction.CommitAsync(cancellationToken);
    }
    catch (Exception e)
    {
        // Log...
    }
}

Solution

  • The code looks ok. As written a commit failure on SQL would cause that. In SQL Server commit failures are rare, and typically only happen when you have an IO issue like running out of space on the transaction log, or are using in-memory OLTP which does commit-time conflict resolution. Your logging should have caught that.

    Another possible cause is a crash in the client program after writing to Cosmos but before committing in SQL server.

    Also the rows may have been inserted, and then deleted in SQL Server.

    In general I would recommend abandoning the dual-write pattern, in favor of an eventually-consistent "write-after" pattern. EG write to SQL and include a flag on the row, or a row in a "queue table" indicating that the row needs to be replicated to Cosmos. Then have a background process that copies all such rows and updates the flag or removes the rows from the queue table.