Tags: java, amazon-web-services, amazon-sqs, scheduling, spring-jms

How to delay retry by 4 hours on SQS?


TL;DR: how can I mimic RabbitMQ's scheduling functionality while keeping the consumer:

  1. stateless
  2. free from managing scheduled messages
  3. free from useless retries of a scheduled message between first receiving it and finally consuming it at the correct scheduled time

I have a single SQS queue created with default properties. A consumer takes 1~2s on average to process a message. A few messages, however, need to be processed twice, with the two attempts 4h apart. I call these messages B and all the others A.

Suppose my queue holds the following messages: A1, A2, B1, A3, B2 (5 messages, at most 10s to consume them all) at the start of this table:

time     | what should happen
---------|-------------------
now      | consumer connects to the queue
now+10s  | all As have been consumed successfully and deleted from the queue; Bs have had their unsuccessful first try and are now waiting for their retry in 4h
between  | nothing happens, since no new messages arrive and the old ones are waiting
now+4h4s | Bs are successfully consumed on the retry and are therefore deleted from the queue

I have a Spring application in which I can throw an exception whenever I find a type B message. For simplicity and scalability, I want a single thread consuming messages, taking 1~2s per message.

This way, I cannot block message processing as this answer suggested. SQS's Delivery delay does not help either, since it only postpones messages arriving at the queue, not retries. If possible, I would like to keep using a long-polling @JmsListener and avoid keeping any state at all in my application's memory. I want to avoid this if possible.


Solution

  • I would write a small AWS Lambda function that gets invoked roughly every minute. The function would receive a message from the incoming SQS queue (ideally a FIFO queue) and check when it was added. If it was added 4 or more hours ago, the function would delete it from the incoming queue and add it to a second, "delayed by 4 hours" queue, which your application listens to. After moving a message, it would keep going until the next message is not yet 4 hours old. Run the Lambda more or less often to control how closely the delay tracks 4 hours, at the added expense of more invocations when it runs more often.

    Here is a quick link to an example of an AWS Lambda function using SQS: https://docs.aws.amazon.com/lambda/latest/dg/with-sqs-example.html
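
The move loop described above can be sketched in plain Java. This is only an in-memory illustration, not the AWS SDK: the two `Deque`s stand in for the incoming and delayed SQS queues, and `DelayedRetryMover`, `Msg`, and `moveDue` are hypothetical names made up for this sketch. It also assumes the incoming queue is ordered oldest first, which is why the answer hopes for a FIFO queue.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

// In-memory stand-in for the scheduled Lambda described above.
// Deques replace the two SQS queues; no AWS calls are made.
class DelayedRetryMover {

    // Hypothetical message shape: a body plus the time it was enqueued.
    static final class Msg {
        final String body;
        final Instant sentAt;

        Msg(String body, Instant sentAt) {
            this.body = body;
            this.sentAt = sentAt;
        }
    }

    static final Duration DELAY = Duration.ofHours(4);

    // Moves messages from the head of `incoming` to the tail of `delayed`
    // while the head is at least DELAY old. Stops at the first message that
    // is still too young. Returns the number of messages moved.
    static int moveDue(Deque<Msg> incoming, Deque<Msg> delayed, Instant now) {
        int moved = 0;
        while (!incoming.isEmpty()) {
            Msg head = incoming.peekFirst();
            Duration age = Duration.between(head.sentAt, now);
            if (age.compareTo(DELAY) < 0) {
                break; // assumes oldest-first order: the rest are younger
            }
            delayed.addLast(incoming.pollFirst());
            moved++;
        }
        return moved;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-01T12:00:00Z");
        Deque<Msg> incoming = new ArrayDeque<>();
        incoming.add(new Msg("B1", now.minus(Duration.ofHours(5))));
        incoming.add(new Msg("B2", now.minus(Duration.ofHours(4))));
        incoming.add(new Msg("B3", now.minus(Duration.ofMinutes(30))));
        Deque<Msg> delayed = new ArrayDeque<>();

        int moved = moveDue(incoming, delayed, now);
        // Prints: moved=2, still waiting=1
        System.out.println("moved=" + moved + ", still waiting=" + incoming.size());
    }
}
```

In a real Lambda, `sentAt` would presumably come from the message's `SentTimestamp` system attribute, and each move would be a send to the delayed queue followed by a delete from the incoming queue. A message that is still too young would simply be left alone so that it becomes visible again once its visibility timeout expires.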