I am using MongoDB to generate unique IDs of this format:
{ID TYPE}{ZONE}{ALPHABET}{YY}{XXXXX}

Here, ID TYPE will be a letter from {U, E, V} depending on the input, ZONE will be a letter from the set {N, S, E, W}, YY will be the last 2 digits of the current year, and XXXXX will be a 5-digit number starting from 0 (padded with 0s to make it 5 digits long). When XXXXX reaches 99999, the ALPHABET part will be incremented to the next letter (starting from A).
I will receive ID TYPE and ZONE as input and have to return the generated unique ID as output. Every time I need to generate a new ID, I read the last ID generated for the given ID TYPE and ZONE, increment the number part by 1 (XXXXX + 1), then save the newly generated ID in MongoDB and return it to the user.
This code will run on a single Node.js server, and multiple clients can call this method. Is there a possibility of a race condition like the one described below if I am only running a single server instance?
USA2100000
USA2100000
USA2100001
USA2100001
Since 2 clients have generated IDs, the DB should finally have had USA2100002.
To overcome this, I am using MongoDB transactions. My code in TypeScript, using Mongoose as the ODM, is something like this:

```typescript
const session = await startSession();
session.startTransaction();
try {
  // Read the last generated ID for this key inside the transaction.
  const lastDoc = await GeneratedId.findOne({ key: idKeyStr }, "value").session(session);
  const lastId = createNextId(lastDoc?.value);
  const newIdObj: any = {
    key: `Type:${idPrefix}_Zone:${zone_letter}`,
    value: lastId,
  };
  await GeneratedId.findOneAndUpdate({ key: idKeyStr }, newIdObj, {
    upsert: true,
    new: true,
    session, // queries must be attached to the session to run in the transaction
  });
  await session.commitTransaction();
} finally {
  session.endSession();
}
```
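Here createNextId parses the last ID and applies the increment and rollover rules; a minimal sketch (the exact signature and the behaviour once ALPHABET passes Z are assumptions, since they are not specified above):

```typescript
// Minimal sketch of createNextId. An ID looks like "USA2100000":
// idType (1 char) + zone (1 char) + alphabet (1 char) + year (2 digits) + number (5 digits).
function createNextId(lastId: string | undefined, idType = "U", zone = "N"): string {
  const yy = String(new Date().getFullYear() % 100).padStart(2, "0");
  if (!lastId) {
    // No ID generated yet for this type/zone: start the sequence at A...00000.
    return `${idType}${zone}A${yy}00000`;
  }
  let alphabet = lastId[2];
  let num = parseInt(lastId.slice(5), 10) + 1;
  if (num > 99999) {
    // Number part exhausted: reset it and advance the ALPHABET part.
    num = 0;
    alphabet = String.fromCharCode(alphabet.charCodeAt(0) + 1);
  }
  return `${lastId.slice(0, 2)}${alphabet}${yy}${String(num).padStart(5, "0")}`;
}
```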
You are using MongoDB to store the ID; that is the state. Generation of the ID is a function. You would be using MongoDB to generate the ID only if the mongod process itself took the function's arguments and returned the generated ID. That's not what you are doing: you are using Node.js to generate the ID.
The number of threads, or rather event loops, is critical as it defines the architecture, but either way you don't need transactions. Transactions in MongoDB are called "multi-document transactions" precisely to highlight that they are intended for consistent updates of several documents at once. The very first paragraph of https://docs.mongodb.com/manual/core/transactions/ warns you that if you update a single document there is no room for transactions.
A single-threaded application does not require any synchronisation. You can reliably read the latest generated ID on start and guarantee the ID is unique within the Node.js process. If you exclude MongoDB and other I/O from the generation function, you make it synchronous, so you can maintain the state of the ID within the Node.js process and guarantee its uniqueness. Once generated, you can persist it in the db asynchronously. In the worst-case scenario you may have a gap in the sequential numbers, but no duplicates.
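As a sketch of this single-process approach (the persistId stand-in and the simplified key/number bookkeeping are assumptions, not the question's actual schema):

```typescript
// Single-process generator: the counter lives in memory, so generation is
// synchronous and cannot race; persistence happens afterwards.
const lastIds = new Map<string, number>(); // key -> last number, loaded at startup

async function persistId(key: string, id: string): Promise<void> {
  // Stand-in for e.g. GeneratedId.findOneAndUpdate({ key }, { key, value: id }, { upsert: true })
}

function generateId(key: string): string {
  // Synchronous read-modify-write: no await between read and write, so no race.
  const num = (lastIds.get(key) ?? -1) + 1;
  lastIds.set(key, num);
  const id = `${key}${String(num).padStart(5, "0")}`;
  // Persist asynchronously; a crash here can leave a gap in the sequence, never a duplicate.
  persistId(key, id).catch((err) => console.error("failed to persist", err));
  return id;
}
```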
If there is the slightest chance that you may need to scale up to more than 1 Node.js process to handle more simultaneous requests, or to add another host for redundancy in the future, you will need to synchronise generation of the ID, and you can employ MongoDB unique indexes for that. The function itself doesn't change much: you still generate the ID as in the single-threaded architecture, but add an extra step to save it to MongoDB. The document should have a unique index on the ID field, so in case of concurrent updates one of the queries will successfully add the document and the other will fail with "E11000 duplicate key error". You catch such errors on the Node.js side and run the function again, picking the next number:
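The retry loop could be structured like this (a sketch: tryInsert stands in for an insert into a collection with a unique index on the ID field, e.g. a Mongoose create call, which on a duplicate key rejects with an error whose code is 11000):

```typescript
type TryInsert = (id: string) => Promise<void>;

// Generate a candidate ID, try to claim it via the unique index, and on
// "E11000 duplicate key error" advance to the next number and try again.
async function generateUniqueId(
  nextCandidate: (attempt: number) => string, // e.g. wraps createNextId from the question
  tryInsert: TryInsert
): Promise<string> {
  for (let attempt = 0; ; attempt++) {
    const candidate = nextCandidate(attempt);
    try {
      await tryInsert(candidate); // succeeds only for the first writer
      return candidate; // the unique index guarantees this ID is globally unique
    } catch (err: any) {
      if (err?.code !== 11000) throw err; // retry only on duplicate key errors
    }
  }
}
```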