.net, sql-server, entity-framework, transactions, unit-of-work

Handling large-scale updates and inserts in Entity Framework transactions


I have a .NET Framework application using Entity Framework and the unit of work pattern. The database table contains over 3 million rows.

In one function, I need to update almost all these rows and perform multiple table inserts within a single transaction, as shown below. However, when unitOfWork.Commit(transaction); is executed, the process runs indefinitely, and the progress indicator keeps loading without completion.

using (var unitOfWork = UnitOfWorkFactory.Create())
{
      using (var transaction = unitOfWork.DataStore.Database.BeginTransaction())
      {
            try
            {
                  SomeDBUpdateFunction1(); // Update 3M rows
                  callback.UpdateClientProgress(new SimpleProgressInfo { Status = "Saving items to the database..." });
                  SomeDBInsertFunction1(); // Insert 30K rows
                  results = SomeDBInsertFunction2(); // Insert 30K rows
                  FileGenerateFunction(); // Generate an output file for this process
                  unitOfWork.Commit(transaction);
            }
            catch (Exception)
            {
                  transaction.Rollback();
                  throw;
            }
      }
}

I need a solution to resolve this issue.

I have attempted batch processing, but since each batch needs to be committed separately, there is no way to roll back already committed batches if a failure occurs mid-process.
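
For reference, my batched attempt looked roughly like the sketch below (Batch() and ApplyUpdates() stand in for my actual helpers, not the real code): because every batch commits on its own, a failure in batch N leaves batches 1 to N-1 permanently applied.

// Sketch of the batch-processing attempt (helper names are placeholders):
// each batch is committed separately, so earlier batches cannot be rolled back.
foreach (var batch in rowsToUpdate.Batch(10000)) // hypothetical batching helper
{
    using (var tx = unitOfWork.DataStore.Database.BeginTransaction())
    {
        ApplyUpdates(batch);   // per-batch EF update + SaveChanges
        unitOfWork.Commit(tx); // committed immediately; cannot be undone if a later batch fails
    }
}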

Currently, all the DB changes are made through LINQ and EF. Would it be beneficial to use SqlBulkCopy() in combination with temporary tables and ExecuteSqlCommand() inside the transaction? Would that improve performance while maintaining atomicity?


Solution

  • SqlBulkCopy() is optimized for bulk inserts, which makes it significantly faster than EF for large datasets. Bulk-loading into a temporary table and then merging into the target table with ExecuteSqlCommand() keeps the heavy work set-based on the server and reduces change-tracking overhead. The entire process remains within a single transaction, so if any step fails, all changes are rolled back, ensuring consistency.

    using (var unitOfWork = UnitOfWorkFactory.Create())
    {
        using (var transaction = unitOfWork.DataStore.Database.BeginTransaction())
        {
            try
            {
                SomeDBUpdateFunction1(); // Update 3M rows

                // Create the temp table on the same connection so it is visible to
                // both the bulk copy and the merge below (column types illustrative).
                unitOfWork.DataStore.Database.ExecuteSqlCommand(
                    "CREATE TABLE #TempTable (ID int PRIMARY KEY, ColumnName nvarchar(100))");

                // SqlBulkCopy must be enlisted in the same SqlTransaction; otherwise it
                // runs outside the unit of work and cannot be rolled back with it.
                var connection = (SqlConnection)unitOfWork.DataStore.Database.Connection;
                var sqlTransaction = (SqlTransaction)transaction.UnderlyingTransaction;
                using (var bulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.Default, sqlTransaction))
                {
                    bulkCopy.DestinationTableName = "#TempTable";
                    bulkCopy.WriteToServer(dataTable);  // Bulk insert into the temp table
                }

                // Set-based merge from the temp table into the target table.
                unitOfWork.DataStore.Database.ExecuteSqlCommand(
                    "UPDATE mt SET mt.ColumnName = t.ColumnName FROM MainTable mt JOIN #TempTable t ON mt.ID = t.ID");

                unitOfWork.Commit(transaction);
            }
            catch (Exception)
            {
                transaction.Rollback();
                throw;
            }
        }
    }
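
  • If SomeDBUpdateFunction1() pushes the ~3M-row update through the EF change tracker, that step alone can dominate the runtime. A hedged sketch of an alternative, assuming the update can be expressed as a single set-based statement (the table, column, and timeout values below are illustrative, not the original schema): issue it with ExecuteSqlCommand() inside the same try block so it stays in the open transaction, and raise Database.CommandTimeout so the long-running statement does not time out.

    // Illustrative set-based replacement for a row-by-row EF update;
    // table/column names and the timeout value are assumptions.
    unitOfWork.DataStore.Database.CommandTimeout = 600; // seconds; SqlClient's default is 30
    unitOfWork.DataStore.Database.ExecuteSqlCommand(
        "UPDATE MainTable SET StatusFlag = 1 WHERE NeedsUpdate = 1");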