amazon-web-services aws-lambda amazon-sqs

AWS Lambda with an SQS trigger does not process 1 message per invocation


I am trying to set up what I thought initially was an easy pipeline.

I have 2 SQS queues and 2 lambdas that should work sequentially.

The problem is that my lambdas do heavy work, so I want each message to be processed by its own Lambda invocation. For this reason, I have set the batch size in the Lambda SQS triggers to 1, and my lambdas start as follows:

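import json

# Assumption: the decorator comes from AWS Lambda Powertools; persistence_layer
# and set_state_in_dynamodb are defined elsewhere in the module.
from aws_lambda_powertools.utilities.idempotency import idempotent

@idempotent(persistence_store=persistence_layer)
def lambda_handler(event, context):
    # Batch size is 1, so the event carries exactly one SQS record
    record = event['Records'][0]
    body = json.loads(record['body'])

    task_id = body['taskId']

    set_state_in_dynamodb(task_id, 'Running')

    # Do all the job

    set_state_in_dynamodb(task_id, 'Success')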

In my world, everything should have worked as follows:

N messages go into Queue 1, each triggering an invocation of Lambda 1. I am fine with a limit on parallel execution, as long as that limit is not 1. After Lambda 1 finishes its work, it puts M messages into Queue 2, which triggers Lambda 2 in the same way.
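
The fan-out step from Lambda 1 to Queue 2 could be sketched as follows. This is a minimal sketch, not the code from the question: `build_batches` and the `taskId` payload shape are hypothetical, and it assumes the messages are then sent with the SQS `send_message_batch` call, which accepts at most 10 entries per request.

```python
import json

MAX_BATCH_ENTRIES = 10  # SQS accepts at most 10 entries per SendMessageBatch call


def build_batches(task_ids, batch_size=MAX_BATCH_ENTRIES):
    """Group task ids into lists of SendMessageBatch entries (hypothetical helper)."""
    entries = [
        # Each entry needs an Id that is unique within its batch
        {"Id": str(index), "MessageBody": json.dumps({"taskId": task_id})}
        for index, task_id in enumerate(task_ids)
    ]
    return [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]


# Usage inside Lambda 1, assuming a boto3 SQS client `sqs` and a hypothetical queue_url:
#   for batch in build_batches(task_ids):
#       sqs.send_message_batch(QueueUrl=queue_url, Entries=batch)
```

With a batch size of 1 on the Queue 2 trigger, each of these messages should still reach Lambda 2 in its own invocation; batching here only reduces the number of send requests.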

Instead, I see the following behavior:

I put 10 messages into Queue 1 and they are all dequeued. However, the state is updated (i.e. set to Running) for only a random subset of the tasks. After the visibility timeout expires, the same thing happens again.

As far as I understand, AWS dequeues more than 1 message and invokes the corresponding lambda, passing only the first one. The others remain invisible, and we have to wait until the visibility timeout has passed before they are dequeued again in the same manner. So, after a few such iterations, I can get a message that was never processed before but has a dequeue count of, say, 3.
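
One way to observe the dequeue count from inside the handler is to read the `ApproximateReceiveCount` attribute that SQS attaches to each record. The following is a minimal sketch: `summarize_records` and the sample event are hypothetical, and the `taskId` field follows the handler shown earlier.

```python
import json


def summarize_records(event):
    """Return a (taskId, receive count) pair for each SQS record in a Lambda event."""
    summary = []
    for record in event.get("Records", []):
        body = json.loads(record["body"])
        # ApproximateReceiveCount arrives as a string in the record attributes
        receive_count = int(record["attributes"]["ApproximateReceiveCount"])
        summary.append((body["taskId"], receive_count))
    return summary


# Hypothetical event, trimmed to the fields used above
sample_event = {
    "Records": [
        {
            "body": json.dumps({"taskId": "task-1"}),
            "attributes": {"ApproximateReceiveCount": "3"},
        }
    ]
}

print(summarize_records(sample_event))
```

Logging this value at the top of the handler shows whether a message that is being processed for the first time has already been received several times.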

The problem gets worse on Queue 2 and Lambda 2.

I tried to change the type of the queue from Standard to FIFO and provided a different MessageGroupId for each message, which resulted in the same behavior.

What am I doing wrong?

I just want my messages to be processed in parallel and only once.


Solution

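  • The issue was the limit on the number of concurrent Lambda executions. It was 10 by default on my account. When invocations are throttled, messages that the SQS poller has already received are not processed and only become visible again after the visibility timeout, which explains the growing dequeue count. I made a request to increase the limit to 1000 and everything works well now.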
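
To check the limit described in the answer, the account settings can be queried. This is a minimal sketch: `concurrency_limits` is a hypothetical helper, and it assumes it is given a boto3 Lambda client, whose `get_account_settings` call returns the limits under `AccountLimit`.

```python
def concurrency_limits(lambda_client):
    """Return the account's total and unreserved concurrent execution limits."""
    account_limit = lambda_client.get_account_settings()["AccountLimit"]
    return {
        "total": account_limit["ConcurrentExecutions"],
        "unreserved": account_limit["UnreservedConcurrentExecutions"],
    }


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   print(concurrency_limits(boto3.client("lambda")))
```

If the reported total is as low as 10, a burst of messages on a queue with batch size 1 is likely to hit throttling, which matches the behavior in the question.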