Search code examples
sqlpostgresqlqueueproducer-consumer

Job queue as SQL table with multiple consumers (PostgreSQL)


I have a typical producer-consumer problem:

Multiple producer applications write job requests to a job-table on a PostgreSQL database.

The job requests have a state field that starts contains QUEUED on creation.

There are multiple consumer applications that are notified by a rule when a producer inserts a new record:

CREATE OR REPLACE RULE "jobrecord.added" AS
  ON INSERT TO jobrecord DO 
  NOTIFY "jobrecordAdded";

They will try to reserve a new record by setting its state to RESERVED. Of course, only one consumer should succeed. All other consumers should not be able to reserve the same record. They should instead reserve other records with state=QUEUED.

Example: some producer added the following records to table jobrecord:

id state  owner  payload
------------------------
1 QUEUED null   <data>
2 QUEUED null   <data>
3 QUEUED null   <data>
4 QUEUED null   <data>

now, two consumers A, B want to process them. They start running at the same time. One should reserve id 1, the other one should reserve id 2, then the first one who finishes should reserve id 3 and so on..

In a pure multithreaded world, I would use a mutex to control access to the job queue, but the consumers are different processes that may run on different machines. They only access the same database, so all synchronization must happen through the database.

I read a lot of documentation about concurrent access and locking in PostgreSQL, e.g. http://www.postgresql.org/docs/9.0/interactive/explicit-locking.html Select unlocked row in Postgresql PostgreSQL and locking

From these topics, I learned, that the following SQL statement should do what I need:

UPDATE jobrecord
  SET owner= :owner, state = :reserved 
  WHERE id = ( 
     SELECT id from jobrecord WHERE state = :queued 
        ORDER BY id  LIMIT 1 
     ) 
  RETURNING id;  // will only return an id when they reserved it successfully

Unfortunately, when I run this in multiple consumer processes, in about 50% of the time, they still reserve the same record, both processing it and one overwriting the changes of the other.

What am I missing? How do I have to write the SQL statement so that multiple consumers will not reserve the same record?


Solution

  • Read my post here:

    https://stackoverflow.com/a/6500830/32688

    If you use transaction and LOCK TABLE you will have no problems.