Can MongoDB Change Events be considered unique?


Description

I am using the MongoDB change stream (C# MongoDB.Driver v2.12.0) to track changes on a single collection. In an experimental use case, the collection stores information about the execution of threads.

A thread has two properties:

  • Status - RUNNING, BLOCKED or COMPLETED
  • BlockedCount - number of blocking threads

During its execution, a thread can spawn child threads and stay blocked until all of them have completed. Whenever a child thread completes its execution, it updates the database by decrementing the BlockedCount of the parent. Once the BlockedCount drops to 0, the parent thread should continue its execution.

Code for subscribing to change stream:

var pipeline = new EmptyPipelineDefinition<ChangeStreamDocument<T>>()
                    .Match(change => change.OperationType == ChangeStreamOperationType.Insert ||
                                     change.OperationType == ChangeStreamOperationType.Update ||
                                     change.OperationType == ChangeStreamOperationType.Replace)
                    .AppendStage<ChangeStreamDocument<T>, ChangeStreamDocument<T>, ChangeStreamOutputWrapper<T>>(
                                     "{ $project: { '_id': 1, 'fullDocument': 1, 'ns': 1, 'documentKey': 1 }}");

var options = new ChangeStreamOptions
{
    FullDocument = ChangeStreamFullDocumentOption.UpdateLookup
};

using (var cursor = await coll.WatchAsync(pipeline, options, cancellationToken))
{
    await cursor.ForEachAsync(async change =>
    {
        // await some handler routine
    }, cancellationToken);
}

Issue

What I have noticed is that the change events can be different even if the update operations are exactly the same. To better explain this, here is an example:

With 1 parent thread and 3 child threads completing their execution, two different behaviors are observed:

  • 3 distinct update events for the parent thread:

    • "Status" : "BLOCKED", "BlockedCount" : 2
    • "Status" : "BLOCKED", "BlockedCount" : 1
    • "Status" : "BLOCKED", "BlockedCount" : 0
  • 3 identical update events for the parent thread:

    • "Status" : "BLOCKED", "BlockedCount" : 0
    • "Status" : "BLOCKED", "BlockedCount" : 0
    • "Status" : "BLOCKED", "BlockedCount" : 0

Questions

  1. Is this considered normal behavior?
  2. Is there some kind of configuration that would prevent this, and fire only the 'latest' update?

Solution

  • Yes, that is the expected behavior. The documentation (link) states that:

    The fullDocument document represents the most current majority-committed version of the updated document. The fullDocument document may vary from the document at the time of the update operation depending on the number of interleaving majority-committed operations that occur between the update operation and the document lookup.

    As far as I know, there is no way to override or adjust this behavior. What you can do instead is read updateDescription directly and track the changes manually. That is not complex as long as BlockedCount is only ever set (i.e., not removed and re-added later).
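
A minimal sketch of the updateDescription approach follows. For an update event, updateDescription.updatedFields holds the new values written by that specific operation, so after a decrement it carries the resulting BlockedCount rather than the result of a later lookup. The BlockedCountReader class and its method names are hypothetical (they do not appear in the question), and the sketch assumes the handler receives a ChangeStreamDocument<T>. Note that the $project stage in the question keeps only _id, fullDocument, ns and documentKey, so updateDescription would also have to be included there and mapped on the ChangeStreamOutputWrapper<T> type, whose shape is not shown.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical helper, not part of the driver or of the question's code.
public static class BlockedCountReader
{
    // Returns the BlockedCount written by this specific update, or null when
    // the event is not an update or the update did not touch that field.
    public static int? FromUpdate<T>(ChangeStreamDocument<T> change)
    {
        if (change.OperationType != ChangeStreamOperationType.Update)
        {
            return null; // insert/replace events carry no updateDescription
        }

        return FromUpdatedFields(change.UpdateDescription?.UpdatedFields);
    }

    // Works on the raw updatedFields document so it can be exercised
    // without a running change stream.
    public static int? FromUpdatedFields(BsonDocument updatedFields)
    {
        if (updatedFields != null &&
            updatedFields.TryGetValue("BlockedCount", out BsonValue value) &&
            value.IsNumeric)
        {
            return value.ToInt32();
        }

        return null;
    }
}
```

Inside the ForEachAsync handler this could be called as BlockedCountReader.FromUpdate(change). For the three child completions in the example it should yield 2, 1 and 0 in that order, regardless of when fullDocument is looked up, so the parent can resume on the event whose value is 0.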