Search code examples
mysqlmessage-queueamazon-sqsproducer-consumer

Faster Producer in Producer/Consumer Queue Pattern


We have a MySQL table which gets about 5 million records written to it daily.

Each record requires a bit of time to process some meta data associated with it. So we have a SINGLE "producer" process which send each record ID to a message queue (SQS).

We then have 50 "consumers" which pull down each ID and conduct the necessary processing tasks.

This pattern works well. However, the volume of data continues to grow. Our single producer can no longer keep up with the amount of data being inserted into the table.

I understand that we can add more consumers to speed up processing. But, what is a good strategy for speeding up the producer without running into ID collisions?

UPDATE

Here is the table structure:

id  int(10) AUTO_INCREMENT
name varchar(255)   
is_processed tinyint(1)
is_queued   tinyint(1)
created_at  timestamp
updated_at  timestamp
meta_data   text

I would like to have multiple producers running, but don't know how to avoid concurrency problems.


Solution

  • I ended up using this approach: Best practices for multithreaded processing of database records - this dispatches sets of records for multiple consumers to pick up. It is much faster than using a queue to process individual records.