node.js mongodb bluebird coroutine

MongoDB - two updates in sequence overlap each other


We are building a size calculation mechanism for our system. To calculate sizes, we start with an atomic operation, findAndModify, which finds the object and adds lock properties to it. The lock keeps other calculations for the same object from interfering with it: we can have many calculations running in parallel, and the others should be postponed until the current one ends. Then we calculate the sizes of specific properties, and afterwards we add the metadata to the object and delete the locks. However, it seems that sometimes, when we have many calculations for a single object (especially when we calculate a lot of objects in parallel), some updates aren't executed.

_size metadata during calculation looks like this:

{
  _lockedAt: SomeDate,
  _transactionId: 'abc'
}

And after calculation it should look like this:

{
  somePropertySize: 123,
  anotherPropertySize: 1245,
  (...)
  _total: 131431523 // Some number
  // Notice that both _lockedAt and _transactionId should be missing
}

And this is what our update flow looks like:

return Promise.coroutine(function * () {

  // Step 1: atomically acquire the lock, but only if the object is not locked yet
  yield object.findOneAndUpdate({
    _id: gemId,
    '_size._lockedAt': {
      $exists: false
    }
  }, {
    $set: {
      '_size._lockedAt': moment.utc().toDate(),
      '_size._transactionId': transactionId
    }
  }).then(results => results.value);

  // Calculations are performed here, new _size object is built

  // Step 2: replace the whole _size object, which also drops the lock properties
  yield object.findOneAndUpdate({
    _id: gemId,
    '_size._lockedAt': {
      $exists: true // We tried both with and without this property, does not change anything
    }
  }, {
    $set: {
      _size: newSizeObject
    }
  });

})()

An example of a real-life new _size object just before the second update (truncated for brevity):

{
  title: 11,
  description: 2,
  detailedSection: 0,
  tags: 2,
  file: 5625898,
  _total: 5625913
}

For some reason, when multiple calculations run close to each other, sometimes (for new objects that had no _size property at all) the objects are left with a _size object that looks exactly as it did right after locking, even though the logs show that everything went well (the calculations completed, the new sizes object was built, and the second DB update was called).

We use MongoDB 3.0 with two replica sets. Any ideas on what is happening?


Solution

  • In the end, I checked the code very carefully, and what was actually happening was that a completely different part of the code was querying the object from the DB and then, after a few other operations (mine included), writing the whole object back to the DB (hence overwriting my changes).

    So, an important note for every MongoDB user: please remember that MongoDB (at least version 3.0, which we use) is not transactional but is still atomic per operation. It guarantees that each single operation is applied as a whole, but it does not guarantee that the data stays untouched between your operations.
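
    The overwrite described above can be reproduced without a database. The sketch below is only an illustration: the Map-based store and the helpers replaceDocument, setField, load and simulate are hypothetical stand-ins, not MongoDB driver calls. It replays the three steps (another writer reads the whole object, the size calculation stores its result, the other writer saves) once with a whole-object write and once with a targeted field update.

```javascript
// In-memory stand-in for a collection (hypothetical, NOT the MongoDB driver API).
const store = new Map();

// Deep copy so that "loaded" objects are detached from the store.
const clone = value => JSON.parse(JSON.stringify(value));

// Whole-document write, like saving a previously loaded object back.
function replaceDocument(id, doc) {
  store.set(id, clone(doc));
}

// Targeted write of one top-level field, comparable to a $set on that field.
function setField(id, field, value) {
  store.set(id, Object.assign({}, store.get(id), { [field]: clone(value) }));
}

function load(id) {
  return clone(store.get(id));
}

function simulate(otherWriterUsesTargetedUpdate) {
  store.clear();
  // The object is already locked by the size calculation.
  store.set('gem1', {
    title: 'old title',
    _size: { _lockedAt: 'SomeDate', _transactionId: 'abc' }
  });

  // 1. Another part of the code loads the whole object.
  const stale = load('gem1');

  // 2. The size calculation finishes and stores the new _size object.
  setField('gem1', '_size', { title: 11, _total: 11 });

  // 3. The other code changes one property and writes its result.
  if (otherWriterUsesTargetedUpdate) {
    setField('gem1', 'title', 'new title');
  } else {
    stale.title = 'new title';
    replaceDocument('gem1', stale); // writes the stale _size back as well
  }
  return load('gem1');
}

const overwritten = simulate(false);
const preserved = simulate(true);
console.log(overwritten._size); // the lock fields are back, the sizes are gone
console.log(preserved._size);   // the calculated sizes are still there
```

    Every single write in this sketch succeeds on its own, which is why the logs look fine. The data is lost only because one writer saves a stale copy of fields it never meant to change.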

    To sum up, things I learned from this example:

    • NEVER update a whole object in the database with data obtained from it some time before (e.g. by querying, changing some properties and saving again).
    • USE $set, $inc, $unset and other update operators. If you have a lot of parameters, use e.g. the mongo-dot-notation npm library to flatten your data into a $set update.
    • If something unexpected is happening with your data (e.g. missing properties after saving), the first thing to investigate is other pending operations on those specific entities.
    • The least probable cause of your problems is MongoDB itself. It's usually code that does not follow atomicity rules (which probably happens to a lot of people used to transactional DBs :)).
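
    As a rough illustration of the second point, the sketch below flattens a nested object into dot-notation keys and builds an update that touches only the size fields and removes the lock. The helpers flatten and buildUnlockUpdate are hypothetical and hand-written for this example; they are not the API of mongo-dot-notation. Matching on '_size._transactionId' in the filter is also an assumption, not something the original flow does.

```javascript
// Hypothetical helper: turns { _size: { title: 11 } } into { '_size.title': 11 }.
// Only non-empty plain objects are descended into; dates, arrays and other
// values are kept as they are.
function flatten(obj, prefix = '') {
  const out = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const isPlainObject = value !== null && typeof value === 'object' &&
      Object.getPrototypeOf(value) === Object.prototype &&
      Object.keys(value).length > 0;
    if (isPlainObject) {
      Object.assign(out, flatten(value, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Hypothetical helper: builds the filter and update document for the final step.
// Assumption: only the calculation that owns the lock may release it.
function buildUnlockUpdate(gemId, transactionId, sizes) {
  return {
    filter: { _id: gemId, '_size._transactionId': transactionId },
    update: {
      // Set each size field on its own instead of replacing the whole object.
      $set: flatten({ _size: sizes }),
      // Remove the lock properties explicitly.
      $unset: { '_size._lockedAt': '', '_size._transactionId': '' }
    }
  };
}

const { filter, update } = buildUnlockUpdate('gem1', 'abc', {
  title: 11,
  description: 2,
  _total: 13
});
console.log(JSON.stringify(filter));
console.log(JSON.stringify(update));
```

    The resulting filter and update could then be passed to findOneAndUpdate in place of the second update from the question. Because the update names each field, a concurrent change to an unrelated property of the same object is left alone.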