node.js, queue, amazon-dynamodb, race-condition

Preventing race conditions with Node.js


I'm writing an application using Node.js 6.3.0 and AWS DynamoDB.

The DynamoDB table holds statistics that are written by 10 different functions (one per statistic measure). The interval is set to 10 seconds, which means that every 10 seconds my function is called 10 times to add all the relevant information.

The putItem function:

function putItem(tableName, itemData, callback) {
    var params = {
        TableName: tableName,
        Item: itemData
    };
    docClient.put(params, function (err, data) {
        if (err) {
            logger.error(params, "putItem failed in dynamodb");
            callback(err, null);
        } else {
            callback(null, data);
        }
    });
}

Now, I created a queue:

var queue = require('./dynamoDbQueue').queue;

It implements a simple fixed-size queue that I took from http://www.bennadel.com/blog/2308-creating-a-fixed-length-queue-in-javascript-using-arrays.htm.

The idea is that if there is a network problem, let's say for a minute, I want all the events to be pushed to the queue. When the problem is resolved, I want to send the queued items to DynamoDB and empty the queue.

So I modified my original function to the following code:

function putItem(tableName, itemData, callback) {
    var params = {
        TableName: tableName,
        Item: itemData
    };
    if (queue.length > 0) {
        queue.push(params);
        callback(null, null);
    } else {
        docClient.put(params, function (err, data) {
            if (err) {
                queue.push(params);
                logger.error(params, "putItem failed in dynamodb");
                handleErroredQueue(); // imaginary function that I need to implement
                callback(err, null);
            } else {
                callback(null, data);
            }
        });
    }
}

But since I have 10 insert functions that run in the same second, there is a chance of a race condition. For example:

execute1 - one call checks that the queue is empty and is about to call docClient.put().

execute2 - at the same time, another call returns from docClient.put() with an error and, as a result, pushes the first row onto the queue.

execute1 - by the time the first call runs docClient.put(), the problem has been resolved and it successfully inserts its data into DynamoDB. The queue still holds the earlier row, which will only be released in the next iteration.

So, for example, if I insert 4 rows with ids 1,2,3,4, the order in which the rows reach DynamoDB is 1,2,4,3.

Is there a way to resolve that?

Thanks!


Solution

  • I think you are on the right track, but instead of checking for an error and only then adding to the queue, I would suggest adding every operation to the queue first and always reading the data from the queue.

    For instance, in your case you call functions 1,2,3,4 and the result is 1,2,4,3 because you use the queue only at the time of an error or interrupted operation.

    Step 1: Every function adds its entry to the queue -> 1,2,3,4
    Step 2: Read from the queue and make the insert. On success, remove the element;
            otherwise redo the operation. This way the rows are inserted in the desired sequence.
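
The two steps above could be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: `createSerialWriter`, `putFn`, and `retryDelayMs` are hypothetical names that are not in the question, and `putFn(params, cb)` stands for whatever performs the write. It uses a plain array rather than the fixed-size queue from the question, and it retries forever, so a long outage would need a size cap or a retry limit.

```javascript
'use strict';

// Hypothetical helper (not from the question): every write goes through one
// FIFO queue and only one write is in flight at a time.
// putFn(params, cb) is an assumed signature for the function doing the write.
function createSerialWriter(putFn, retryDelayMs) {
  var pending = [];      // FIFO of { params, callback }
  var draining = false;  // true while a write is in flight or a retry is scheduled

  function drain() {
    if (pending.length === 0) {
      draining = false;
      return;
    }
    draining = true;
    var head = pending[0]; // peek only; the entry is removed after success

    putFn(head.params, function (err, data) {
      if (err) {
        // Leave the head in place and retry it later, so no later
        // entry can overtake it.
        setTimeout(drain, retryDelayMs);
        return;
      }
      pending.shift();           // Step 2: remove the element on success
      head.callback(null, data);
      drain();                   // then move on to the next entry
    });
  }

  // Same signature as putItem in the question.
  return function putItem(tableName, itemData, callback) {
    // Step 1: always enqueue, whether or not there was an error before.
    pending.push({
      params: { TableName: tableName, Item: itemData },
      callback: callback
    });
    if (!draining) {
      drain();
    }
  };
}
```

With the real client, you would presumably pass something like `docClient.put.bind(docClient)` as `putFn`. Note that in this sketch the caller's callback fires only once its item has actually been written, which may be later than in the original code.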
    

    Another advantage is that, because the writes go through the queue one at a time, you don't have to provision very high write throughput for the table.

    Edit:

    I guess you just need to ensure that you start the next operation only after the previous one has completed, and not before.

    e.g.: fn 1 -> read from queue (don't delete from the queue yet) -> operation completed? If not, perform it again -> delete from queue -> perform next operation.

    You just have to make sure you read from the queue and wait until you get the response from DynamoDB.

    Hope this helps.