c# .net azure azure-table-storage

Azure Table Storage: help me clarify what's happening behind the scenes in a batch operation


I have a method that submits a batch transaction to Table Storage (NuGet: Azure.Data.Tables 12.6.1).
The code is below:

private static async Task BatchManipulateEntities<T>(TableClient tableClient, IEnumerable<T> entities, TableTransactionActionType tableTransactionActionType, int batchSize) where T : class, ITableEntity, new()
{
    var groups = entities.GroupBy(x => new { x.PartitionKey });
    foreach (var group in groups)
    {
        var items = group.AsEnumerable();
        while (items.Any())
        {
            var batch = items.Take(batchSize);
            items = items.Skip(batchSize);

            var actions = batch.Select(e => new TableTransactionAction(tableTransactionActionType, e)).ToList();
            await tableClient.SubmitTransactionAsync(actions); // <-- Will this count as one batch write operation?
        }
    }
}

This calls SubmitTransactionAsync with up to a hundred TableTransactionActions. But will the submitted batch transaction count as one "batch write operation" behind the scenes, or will it actually be 100 separate ones?

A batch write operation is three times more costly than a normal write operation, but if a hundred entities are uploaded as one batch write operation behind the scenes, then I'm a happy man ;)
Azure Table Storage Pricing

I would really appreciate it if somebody smarter could clarify this!


Solution

  • You can read more about it on the docs page: https://learn.microsoft.com/en-us/rest/api/storageservices/performing-entity-group-transactions

    From the docs:

    Operations within a change set are processed atomically; that is, all operations in the change set either succeed or fail. Operations are processed in the order they are specified in the change set.

    Because a transaction can only touch one partition at a time, I assume the service does something to make it cheaper than 100 separate operations, but the docs don't specify how it is implemented on the backend. The important thing is that you can treat it as one operation and know that all of its operations either succeeded or failed; you won't end up with a partially applied transaction.
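
To make the "one operation per transaction" idea concrete, here is a minimal sketch that counts how many transactions (one SubmitTransactionAsync call, and so one request to the service, each) the method in the question would send per partition. It does not touch the SDK and says nothing about how each request is billed. The names BatchPlanner and CountTransactions are hypothetical, made up for this illustration; the limit of 100 operations per transaction comes from the entity group transaction docs linked above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Azure.Data.Tables.
public static class BatchPlanner
{
    // An entity group transaction may contain at most 100 operations.
    public const int MaxActionsPerTransaction = 100;

    // For each partition key, returns how many transactions would be submitted
    // when the entities are split into chunks of batchSize.
    public static Dictionary<string, int> CountTransactions(
        IEnumerable<string> partitionKeys,
        int batchSize = MaxActionsPerTransaction)
    {
        if (batchSize < 1 || batchSize > MaxActionsPerTransaction)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        return partitionKeys
            .GroupBy(pk => pk)
            // Ceiling division: a partial final chunk still needs its own transaction.
            .ToDictionary(g => g.Key, g => (g.Count() + batchSize - 1) / batchSize);
    }
}
```

For example, 250 entities in partition "A" and 100 in partition "B" would give 3 transactions for "A" and 1 for "B", so 4 requests in total rather than 350.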