I have a workflow that ingests and transforms a large CSV, splits it into 20 smaller pieces, and uploads them to Postgres using SQS/Lambda. Essentially, each of the 20 messages in the queue should represent one file. This is causing issues because messages are being batched together, so a single Lambda invocation tries to process multiple messages/files. The SQS event looks roughly like this:
{
  "Records": [
    {
      "messageId": "guid",
      "receiptHandle": "hash",
      "body": "file path 1"
    },
    {
      "messageId": "guid",
      "receiptHandle": "hash",
      "body": "file path 2"
    },
    {
      "messageId": "guid",
      "receiptHandle": "hash",
      "body": "file path 3"
    }
  ]
}
With this event, one Lambda invocation tries to process all three files. Ideally, I would like 20 invocations, each receiving a single file, like this:
{
  "Records": [
    {
      "messageId": "guid",
      "receiptHandle": "hash",
      "body": "file path 1"
    }
  ]
}
Is this possible?
According to the documentation, you should set the BatchSize property to 1:

BatchSize – The maximum number of items to retrieve in a single batch.
If you are using the AWS Console, go to your Lambda function, click Add trigger, select SQS, and set Batch size to 1.
In CloudFormation, using a SAM template:
YourFunction:
  Type: AWS::Serverless::Function
  Properties:
    Events:
      SQSEvent:
        Type: SQS
        Properties:
          Queue: !GetAtt YourQueue.Arn  # the Queue property expects the queue ARN
          BatchSize: 1
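Note that even with BatchSize: 1 the event still arrives wrapped in a Records array, so it is safer to iterate over it than to hard-code Records[0]. A minimal handler sketch, assuming a hypothetical process_file helper that loads one CSV piece into Postgres:

def process_file(file_path):
    # Placeholder: ingest one CSV piece into Postgres.
    ...

def handler(event, context):
    # With BatchSize: 1 this loop runs exactly once per invocation,
    # but iterating keeps the handler correct if the batch size ever changes.
    for record in event["Records"]:
        process_file(record["body"])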