I have an architecture that looks like this:
The ECS task runs a dockerized Python container and uses the boto3 SQS client to retrieve and parse the SQS message:

    sqs_response = get_sqs_task_data('<sqs_queue_url>')
    sqs_message = parse_sqs_message(sqs_response)

    while sqs_message is not None:
        # Process it
        # Delete it from the queue

        # Get the next message in the queue
        sqs_response = get_sqs_task_data('<sqs_queue_url>')
        sqs_message = parse_sqs_message(sqs_response)

    import logging

    import boto3


    def get_sqs_task_data(queue_url):
        # Ask SQS for at most one message from the queue
        client = boto3.client('sqs')
        response = client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1
        )
        return response


    def parse_sqs_message(response_sqs_message):
        # The 'Messages' key is absent when the receive call returned nothing
        if 'Messages' not in response_sqs_message:
            logging.info('No messages found in queue')
            return None

        # ... parse it and return a dict
        return {
            'data_1': ...,
            'data_2': ...
        }

All in all, pretty straightforward.
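The "process it" and "delete it" steps are elided in the loop above. A minimal sketch of the whole receive, process, delete cycle could look like the following. The names `drain_queue` and `handle_message` are hypothetical, and `client` is assumed to behave like `boto3.client('sqs')`; passing it in as a parameter is only a choice made here to keep the sketch easy to test.

```python
import logging


def drain_queue(client, queue_url, handle_message):
    """Receive, handle, and delete messages one at a time until none are returned.

    `client` is assumed to behave like boto3.client('sqs').
    `handle_message` is a hypothetical callback that processes one message body.
    Returns the number of messages handled.
    """
    handled = 0
    while True:
        response = client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1,
        )
        messages = response.get('Messages', [])
        if not messages:
            logging.info('No messages found in queue')
            return handled

        message = messages[0]
        handle_message(message['Body'])

        # Deleting needs the receipt handle returned by this receive call.
        client.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message['ReceiptHandle'],
        )
        handled += 1
```

In the ECS task this would be called as something like `drain_queue(boto3.client('sqs'), '<sqs_queue_url>', process)`, where `process` stands for whatever the task does with a message body.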
In get_sqs_task_data(), I explicitly request only one message (because one ECS task has to process only one message).
In parse_sqs_message(), I check whether there are any messages left in the queue with:

    if 'Messages' not in response_sqs_message:
        logging.info('No messages found in queue')
        return None

When there is only one message in the queue (meaning one ECS task has been triggered), everything is working fine. The ECS task is able to pick the message, process it and delete it.
However, when the queue is populated with X messages (X > 1) at the same time, X ECS tasks are triggered, but only one ECS task is able to fetch one of the messages and process it.
All the other ECS tasks exit with "No messages found in queue", although there are X - 1 messages left to be processed.
Why is that? Why are the other tasks not able to pick up the remaining messages?
If that matters, the VisibilityTimeout of the SQS queue is set to 30 minutes.
Any help would be greatly appreciated!
Feel free to ask for more details if needed.
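I forgot to post an answer to this question.
The problem was that the SQS queue was set up as a FIFO queue. A FIFO queue preserves message order within a message group: while one message of a group is in flight (received but not yet deleted, and still within its visibility timeout), SQS does not hand out further messages from that group. If all messages share a single message group ID, which was presumably the case here, effectively only one consumer can work at a time. Changing it to a normal (standard) queue fixed the issue.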
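If the queue has to stay FIFO for other reasons, one alternative to switching queue types is to send each message under its own MessageGroupId, since FIFO ordering is only enforced within a message group. The sketch below is not what I ended up doing, just an illustration. The function name `send_in_separate_groups` and the use of random UUIDs are assumptions made here, and `client` is assumed to behave like `boto3.client('sqs')`.

```python
import uuid


def send_in_separate_groups(client, queue_url, bodies):
    """Send each body to a FIFO queue under its own message group.

    `client` is assumed to behave like boto3.client('sqs'). The random
    UUIDs are an illustrative choice, not a requirement of SQS.
    Returns the list of MessageId values reported by the client.
    """
    message_ids = []
    for body in bodies:
        response = client.send_message(
            QueueUrl=queue_url,
            MessageBody=body,
            # A distinct group per message: no ordering is enforced between
            # them, so different consumers can receive them concurrently.
            MessageGroupId=str(uuid.uuid4()),
            # Needed unless content-based deduplication is enabled on the queue.
            MessageDeduplicationId=str(uuid.uuid4()),
        )
        message_ids.append(response.get('MessageId'))
    return message_ids
```

This gives up ordering between messages entirely, the same trade-off as moving to a standard queue, so it only makes sense when the messages are independent of each other.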