c# mongodb azure-cosmosdb azure-rm-template

Exactly 50% of documents are deleted with Cosmos DB and shard key support


I'm using Cosmos DB with the MongoDB 3.6 API, and a collection with autoscale and a shard key enabled. I'm using this ARM template: https://github.com/Azure/azure-quickstart-templates/blob/master/101-cosmosdb-mongodb-autoscale/azuredeploy.json.

I have a piece of code that cleans the collections before app startup using the C# driver. I'm not using BulkWriteAsync because I don't want to exceed my throughput setting (currently 500 - 5000 RU/s).

foreach (var collectionName in Collections)
{
   var collection = database.GetCollection<BsonDocument>(collectionName);
   long count = await collection.CountDocumentsAsync(
       Builders<BsonDocument>.Filter.Empty, null, cancellationToken);
   long deleted = 0;
   while (deleted < count)
   {
       var nextBatchCount = (int)Math.Min(count - deleted, BatchSizeDelete);
       var batch = await collection
           .Aggregate()
           .Skip((int)deleted)
           .Limit(nextBatchCount)
           .Project(Builders<BsonDocument>.Projection.Include("_id"))
           .ToListAsync(cancellationToken);
       deleted += nextBatchCount;
       await collection.DeleteManyAsync(
           Builders<BsonDocument>.Filter.In("_id", batch.Select(x => x["_id"])), cancellationToken);
       Log.Information("Deleted {deleted} from {count} records", deleted, count);
       await Task.Delay(TimeSpan.FromSeconds(0.5), cancellationToken);
   }
}

And here is the deployment template:

    {
      "type": "Microsoft.DocumentDB/databaseAccounts/mongodbDatabases",
      "apiVersion": "2020-06-01-preview",
      "name": "[concat(parameters('account_name'), '/PatientRecords')]",
      "dependsOn": [],
      "properties": {
        "resource": {
          "id": "PatientRecords"
        },
        "options": {
          "autoscaleSettings": {
            "maxThroughput": "[parameters('autoscaleMaxThroughput')]"
          }
        }
      }
    },
    {
      "type": "Microsoft.DocumentDB/databaseAccounts/mongodbDatabases/collections",
      "apiVersion": "2020-06-01-preview",
      "name": "[concat(parameters('account_name'), '/PatientRecords/PaRecords')]",
      "dependsOn": [
        "[resourceId('Microsoft.DocumentDB/databaseAccounts/mongodbDatabases', parameters('account_name'), 'PatientRecords')]"
      ],
      "properties": {
        "resource": {
          "id": "PaRecords",
          "shardKey": {
            "ClinicId": "Hash"
          }
        },
        "options": {
          "throughput": 400
        }
      }
    },
    {
      "type": "Microsoft.DocumentDB/databaseAccounts/mongodbDatabases/collections",
      "apiVersion": "2020-06-01-preview",
      "name": "[concat(parameters('account_name'), '/PatientRecords/PatientRecords')]",
      "dependsOn": [
        "[resourceId('Microsoft.DocumentDB/databaseAccounts/mongodbDatabases', parameters('account_name'), 'PatientRecords')]"
      ],
      "properties": {
        "resource": {
          "id": "PatientRecords",
          "shardKey": {
            "ClinicId": "Hash"
          }
        },
        "options": {}
      }
    }

It worked until I enabled shardKey in the ARM template. Now, for some reason, exactly 50% of the documents are deleted. For example, if there are 25000 documents in the collection, only 12500 are deleted, and so on.

I also tried BulkWriteAsync, but the result is the same.

What could be the root cause of this strange behaviour? Or is my approach wrong?
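
One possible reading of the loop above, offered as an assumption rather than a confirmed diagnosis: `deleted` is used as the Skip offset, but the documents it counts are already gone from the collection, so each pass also jumps over the same number of surviving documents. Once the offset reaches the number of survivors, every further batch comes back empty. The sketch below replays that loop against an in-memory list instead of Cosmos DB. `SimulateRemaining` and the batch size of 500 are hypothetical (the value of `BatchSizeDelete` is not given), and the sketch assumes documents come back in a stable order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipWhileDeletingDemo
{
    // Replays the question's loop on an in-memory list of ids (hypothetical helper).
    // Returns how many "documents" are left once the loop finishes.
    public static int SimulateRemaining(int total, int batchSize)
    {
        var docs = Enumerable.Range(0, total).ToList();
        long count = docs.Count;   // counted once, like CountDocumentsAsync
        long deleted = 0;

        while (deleted < count)
        {
            var nextBatchCount = (int)Math.Min(count - deleted, batchSize);

            // Same shape as .Skip(deleted).Limit(nextBatchCount) in the question,
            // but applied to the documents that still exist.
            var batch = new HashSet<int>(docs.Skip((int)deleted).Take(nextBatchCount));

            // The counter advances even when the batch is short or empty.
            deleted += nextBatchCount;

            // Equivalent of DeleteManyAsync with an "_id in batch" filter.
            docs.RemoveAll(id => batch.Contains(id));
        }

        return docs.Count;
    }

    public static void Main()
    {
        Console.WriteLine(SkipWhileDeletingDemo.SimulateRemaining(25000, 500)); // prints 12500
    }
}
```

With 25000 documents and batches of 500 the simulation leaves 12500 behind, which matches the numbers reported above. It cannot show whether enabling the shard key changes the result ordering in a way that exposes this, so that part remains open.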


Solution

  • I was able to mitigate the issue myself. It's not exactly clear what caused the incomplete deletion, but the following code worked:

    // Fetch only the '_id' of every document
    var data = await collection
        .Aggregate()
        .Project(Builders<BsonDocument>.Projection.Include("_id"))
        .ToListAsync(cancellationToken);
    if (data.Count > 0)
    {
        // Use bulk write and DeleteOneModel
        await collection.BulkWriteAsync(
            data.Select(x => new DeleteOneModel<BsonDocument>(Builders<BsonDocument>.Filter.Eq("_id", x["_id"]))),
            new BulkWriteOptions() { BypassDocumentValidation = true },
            cancellationToken
        );
        Log.Information("Deleted {Count} documents in {collectionName}", data.Count, collectionName);
    }
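
The mitigation above reads every `_id` and sends all deletes in a single call, which may work against the original goal of staying under the RU limit. A minimal sketch of a batched variant follows, assuming the official MongoDB C# driver. `CollectionCleaner`, `DeleteAllInBatchesAsync`, `batchSize` and `pause` are hypothetical names, not part of the driver. The loop never uses Skip: each pass reads the first `batchSize` ids still present, deletes them, and stops when the read comes back empty. It has not been run against Cosmos DB.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class CollectionCleaner
{
    // Hypothetical helper: deletes every document in small batches
    // and returns the number of documents the server reported as deleted.
    public static async Task<long> DeleteAllInBatchesAsync(
        IMongoCollection<BsonDocument> collection,
        int batchSize,
        TimeSpan pause,
        CancellationToken cancellationToken)
    {
        // MongoDB treats a limit of 0 as "no limit", so reject non-positive sizes.
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        long totalDeleted = 0;
        while (true)
        {
            // Always read from the start of whatever is left; no Skip offset.
            var batch = await collection
                .Find(Builders<BsonDocument>.Filter.Empty)
                .Project(Builders<BsonDocument>.Projection.Include("_id"))
                .Limit(batchSize)
                .ToListAsync(cancellationToken);

            if (batch.Count == 0)
                break; // nothing left to delete

            var ids = batch.Select(x => x["_id"]);
            var result = await collection.DeleteManyAsync(
                Builders<BsonDocument>.Filter.In("_id", ids), cancellationToken);

            // Stop rather than spin forever if the server deleted nothing.
            if (result.DeletedCount == 0)
                break;
            totalDeleted += result.DeletedCount;

            // Assumed throttle, mirroring the 0.5 s delay in the question.
            await Task.Delay(pause, cancellationToken);
        }

        return totalDeleted;
    }
}
```

It could be called as `await CollectionCleaner.DeleteAllInBatchesAsync(collection, 500, TimeSpan.FromSeconds(0.5), cancellationToken);`. The batch size and delay here are placeholders and would still need tuning against the provisioned throughput.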