c# azure-table-storage azure-functions

Getting "Entity already exists" error writing aggregates to Azure Table Storage (with Azure Function)


In an Azure Function, I'm trying to aggregate some data and write the aggregations into a Table.

I have a query that summarises data:

var query = recs
            .GroupBy(r => new { r.Category, r.Account, r.Record })
            .Select(r => new ts_webhitaggregate_account
                    {
                        PartitionKey = partition,  // Constant
                        RowKey = $"{r.Key.Category}:{r.Key.Account}:{r.Key.Record}", // Group By 
                        rawDate = intervaldate,   // Constant
                        epochDate = intervalepoch, // Constant
                        Category = r.Key.Category, // Group By 
                        Account = r.Key.Account, // Group By 
                        Record = r.Key.Record, // Group By 
                        Hits = r.Count(), // Aggregate
                        Users = r.Select(t => t.User).Distinct().Count(), // Aggregate
                        Devices = r.Select(t => t.Device).Distinct().Count() // Aggregate
                    });

I then pass these records to the ICollector-bound Table:

foreach (ts_webhitaggregate_account a in query.ToList())
{
    webhitsAggAccount.Add(a);
}

I'm frequently encountering an "Entity already exists" error like:

Exception has been thrown by the target of an invocation. Microsoft.WindowsAzure.Storage: 82:The specified entity already exists.

If I were writing a comparable SQL statement, I wouldn't expect duplicates of the compound key to be possible, since every value being written is either part of the key, a constant, or an aggregate. There is also no pre-existing data in the Table that could be causing the conflict.
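That expectation can be checked in isolation. The following is a minimal sketch, assuming a hypothetical `Hit` class standing in for the elements of `recs`; `RowKeyCheck`, `BuildRowKeys`, and `FindDuplicates` are illustrative names that do not appear in the original code. Grouping on the composite key yields one group per distinct key, so a single run of the query should not repeat a RowKey unless a field value itself contains the `:` delimiter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the records being aggregated.
public class Hit
{
    public string Category { get; set; }
    public string Account { get; set; }
    public string Record { get; set; }
}

public static class RowKeyCheck
{
    // Builds one RowKey per (Category, Account, Record) group,
    // using the same key format as the query in the question.
    public static List<string> BuildRowKeys(IEnumerable<Hit> recs)
    {
        return recs
            .GroupBy(r => new { r.Category, r.Account, r.Record })
            .Select(g => $"{g.Key.Category}:{g.Key.Account}:{g.Key.Record}")
            .ToList();
    }

    // Returns every key that occurs more than once in the sequence.
    public static List<string> FindDuplicates(IEnumerable<string> keys)
    {
        return keys
            .GroupBy(k => k)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
    }
}
```

Running `FindDuplicates` over `BuildRowKeys(recs)` should come back empty for ordinary data. If it does, the duplicates are probably coming from somewhere outside the query, such as the same query being run more than once over overlapping input.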

What am I doing wrong that generates duplicates in my query?


Solution

  • I believe I found my mistake. The code above runs inside a loop over categories, and each iteration should have selected a single category. However, one range selection on the row key also included another category, which then went on to be selected itself, so the second category's entities were inserted twice.

    When in doubt, print everything to the console!
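The kind of overlap described above can be reproduced with a small trace. This is only a sketch under stated assumptions: the original range filter is not shown, so `SelectRange` is a hypothetical over-wide filter (every key from `category` up to `category + "~"`, compared ordinally), and `DuplicateKeyTrace` and `FindRepeatedWrites` are illustrative names. A `HashSet` records every key handed to the writer across loop iterations and reports any key seen a second time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateKeyTrace
{
    // Hypothetical over-wide range filter: keys in [category, category + "~").
    // With ordinal comparison, "webapp:..." also falls inside the range for "web".
    public static IEnumerable<string> SelectRange(IEnumerable<string> sourceKeys, string category)
    {
        string lower = category;
        string upper = category + "~";
        return sourceKeys.Where(k =>
            string.CompareOrdinal(k, lower) >= 0 &&
            string.CompareOrdinal(k, upper) < 0);
    }

    // Walks the category loop and returns the keys that would be written twice.
    public static List<string> FindRepeatedWrites(
        IEnumerable<string> sourceKeys, IEnumerable<string> categories)
    {
        var written = new HashSet<string>(StringComparer.Ordinal);
        var repeated = new List<string>();

        foreach (var category in categories)
        {
            foreach (var key in SelectRange(sourceKeys, category))
            {
                // HashSet.Add returns false when the key was already present.
                if (!written.Add(key))
                {
                    Console.WriteLine($"Duplicate write while processing '{category}': {key}");
                    repeated.Add(key);
                }
            }
        }

        return repeated;
    }
}
```

With source keys `web:acct1:home` and `webapp:acct1:home` and categories `web` and `webapp`, the `web` iteration selects both keys, and the `webapp` iteration then selects `webapp:acct1:home` a second time. In this sketch, bounding the range at the delimiter instead (from `web:` up to `web;`, the next character in ordinal order) would keep `webapp` out of the `web` iteration.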