We have a node application which uses scheduled jobs, My scheduled job is working 6 times in a day. Now we are going to upgrade the node application into cluster based, I mean master and worker nodes. By doing that i see that jobs are duplicated and executed multiple times.
Note: we using sql server as database.
First
SQL server is the worst possible piece of software to implement this, especially if it is very dynamic. The reason being that engines like MEMORY and MyISAM have only full-table locks while more suitable engines like InnoDB have a higher write penalty (to provide ACID properties) and are optimized for accessing records that are spatially and temporally close, which is not the case with the scenario you've provided.
Solution With SQL SERVER
However, if you insist on using SQL server. Here is the solution -
The best way to implement a job queue in a relational database system is to use SKIP LOCK. Scheduling can be classified as a Graph problem where each transaction in SQL table can be regarded as one node and you're allowed to visit those nodes only once in a DAG fashion. To solve the problem, you need to use SKIP LOCKS with your Job Queue Schema
QueueMsgId identity -- NOT NULL
QueueMsgType varchar(20) -- NOT NULL
QueueState char(1) -- 'N' New if queued, 'A' Active if processing, 'C' Completed, default 'N' -- NOT NULL
CreateTime datetime -- default GETDATE() -- NOT NULL
QueueMsg varchar(255) -- NULLable
The QueueClient pulls one message and changes the state to Active while it works on it. When it's done it changes the state to Complete. SKIP LOCKED is a lock mechanism that acquires a lock to both read/share (FOR SHARE) or write/exclusive (FOR UPDATE).
If we have multiple concurrent users trying to access the Queue records, then we want to ensure that no-one can change the data after we have read it, we can use a repeatable read. By using that, any reads of any tables we do will be locked for Update or Delete. Execute the following SELECT queries which lock the Queue records exclusively while also adding the SKIP LOCKED option for Concurrency:
SELECT
q.QueueMsgId AS id1,
q.QueueMsgType AS msgType1,
q.QueueState AS state1,
FROM
Queue q
WHERE
q.QueueState = 'A'
ORDER BY
q.QueueMsgId
LIMIT 2
FOR UPDATE OF q SKIP LOCKED
For multiple users, execute the same query and there will be no-conflict for Queue records
Best Solution
DUMP SQL and shift to REDIS for high query per second throughput. Though Redis's Data Structure being concurrent is difficult to handle. Fortunately, we have an out-of-box solution for Job/Message Queue system in node.js - BULL, for cluster-based deployment check this example.