I have a PostgreSQL database (v9.5.3) that's hosting "jobs" for workers to pull, run, and commit back.
When a worker wants a job, it runs something to the effect of:
SELECT MIN(sim_id) FROM job WHERE job_status = 0;
-- status 0 indicates it's ready to be run
job
is a table with this schema:
CREATE TABLE commit_schema.job (
sim_id serial NOT NULL,
controller_id smallint NOT NULL,
controller_parameters smallint NOT NULL,
model_id smallint NOT NULL,
model_parameters smallint NOT NULL,
client_id smallint,
job_status smallint DEFAULT 0,
initial_glucose_id smallint NOT NULL
);
Afterwards, it uses this sim_id
to piece together a bunch of parameters in a JOIN
:
SELECT a.par1, b.par2 FROM
a INNER JOIN b ON a.sim_id = b.sim_id;
These parameters are then return to the worker, along with the sim_id
, and the job is run. The sim_id
is locked by setting job.job_status
to 1, using an UPDATE
:
UPDATE job SET job_status = 1 WHERE sim_id = $1;
The results are then committed using that same sim_id
.
Ideally,
Workers wouldn't under any circumstances be able to get the same sim_id
.
Two workers requesting a job won't error out, one will just have to wait to receive a job.
I think that using a serializable isolation level will ensure that the MIN()
always returns unique sim_id
's, but I believe this may also be achievable using a read committed isolation level. Then again, MIN()
may not be able to concurrently and deterministically give unique sim_id
's to two concurrent workers?
This should work just fine for concurrent access using the default isolation level Read Committed and FOR UPDATE SKIP LOCKED
(new in pg 9.5):
UPDATE commit_schema.job j
SET job_status = 1
FROM (
SELECT sim_id
FROM commit_schema.job
WHERE job_status = 0
ORDER BY sim_id
LIMIT 1
FOR UPDATE SKIP LOCKED
) sub
WHERE j.sim_id = sub.sim_id
RETURNING sim_id;
job_status
should probably be defined NOT NULL
.
Be wary of certain corner cases - detailed explanation in this related answer on dba.SE:
There are various ways to return from a function:
RETURN QUERY
with PL/pgSQL.RETURNING ... INTO
- which can be an OUT
parameter, so it will be returned at the end of the function automatically. Or any other variable and return it explicitly.Related (with code examples):