I have a Lambda function in Python 3.7 which gets invoked explicitly and synchronously via Boto3. It is configured with a timeout of 5 minutes. While the 1st invocation is in progress for a minute, a second invocation is started with another request ID. Another minute later, the request is again retried, and this time it finished quicky. as it detects the system state has changed. An example sequence of invocations
The request ID that's returned from my Lambda call is d9d60271-4626-43ab-bb3a-f14be057af13.
The stack that invokes the Lambda is as follows:
InvocationType="RequestResponse"
.None of the invocations exits with an unhandled exception or any obvious error. Is Lambda retrying my calls? Could any other element of this stack, notably Jenkins, be "transparently" retrying the task, and if so, how would I establish that?
The problem turned out to arise from Boto3 on the client-side, and not from Lambda! When a lambda returns nothing for more than a minute, Boto3 by default times out. To fix this, I had to override Boto's default config:
lambda_config = botocore.config.Config(region_name=region, read_timeout=300)
lambda_client = boto3.client('lambda', config=lambda_config)
The way I demonstrated this was a client-side problem was by setting the parameter reserved_concurrent_executions to 1, thus preventing concurrent executions of my Lambda, and noting that I get an exception from Boto.