I'm trying to manage Lambda retries in a situation where EventBridge asynchronously invokes a Lambda function via an events rule (see template at bottom).
I've tried to configure retry behaviour on both the EventBridge and Lambda sides, in particular -
Event rule max retry attempts set to zero, and dead letter queue configured
Lambda event config configured with max retry attempts also set to zero, and Lambda destination queue also configured
I can push a "good" message to EventBridge -
{'action': 'add', 'args': {'x': 2, 'y': 2}}
and this gets picked up by Lambda -
[INFO] 2021-11-19T06:56:25.242Z 590c6514-ad4d-4906-a748-9820af748e76 received: {'version': '0', 'id': '62f363a1-9e0e-a154-8d6a-bce81d22d47f', 'detail-type': 'foobar', 'source': 'whatevs', 'account': '119552584133', 'time': '2021-11-19T06:56:24Z', 'region': 'eu-west-1', 'resources': [], 'detail': {'action': 'add', 'args': {'x': 2, 'y': 2}}}
[INFO] 2021-11-19T06:56:25.242Z 590c6514-ad4d-4906-a748-9820af748e76 result: 4
I can also send a "bad" message to EventBridge -
{'action': 'add', 'args': {'x': 1, 'y': 'a'}}
and this results in a Lambda error -
[INFO] 2021-11-19T06:50:49.603Z b25129f4-d89a-493c-b85e-7ffaef995c71 received: {'version': '0', 'id': '8bb8b3d2-3725-8a24-19ea-547a6a8b799d', 'detail-type': 'foobar', 'source': 'whatevs', 'account': '119552584133', 'time': '2021-11-19T06:47:53Z', 'region': 'eu-west-1', 'resources': [], 'detail': {'action': 'add', 'args': {'x': 1, 'y': 'x'}}}
[ERROR] TypeError: unsupported operand type(s) for +: 'int' and 'str'Traceback (most recent call last): File "/var/task/index.py", line 7, in handler result=args["x"]+args["y"]
So far so good, but the problem is that I still get the standard Lambda retry behaviour at approximately T+60 and T+180 seconds, resulting in further errors -
[INFO] 2021-11-19T06:52:46.142Z 897efce2-bb04-45d8-8b3b-4e1e854cdc13 received: {'version': '0', 'id': '56252e23-dbb1-8025-9eda-45cecaa9f04e', 'detail-type': 'foobar', 'source': 'whatevs', 'account': '119552584133', 'time': '2021-11-19T06:52:45Z', 'region': 'eu-west-1', 'resources': [], 'detail': {'action': 'add', 'args': {'x': 1, 'y': 'a'}}}
[ERROR] TypeError: unsupported operand type(s) for +: 'int' and 'str'Traceback (most recent call last): File "/var/task/index.py", line 7, in handler result=args["x"]+args["y"]
[INFO] 2021-11-19T06:53:50.326Z 897efce2-bb04-45d8-8b3b-4e1e854cdc13 received: {'version': '0', 'id': '56252e23-dbb1-8025-9eda-45cecaa9f04e', 'detail-type': 'foobar', 'source': 'whatevs', 'account': '119552584133', 'time': '2021-11-19T06:52:45Z', 'region': 'eu-west-1', 'resources': [], 'detail': {'action': 'add', 'args': {'x': 1, 'y': 'a'}}}
[ERROR] TypeError: unsupported operand type(s) for +: 'int' and 'str'Traceback (most recent call last): File "/var/task/index.py", line 7, in handler result=args["x"]+args["y"]
[INFO] 2021-11-19T06:55:59.477Z 897efce2-bb04-45d8-8b3b-4e1e854cdc13 received: {'version': '0', 'id': '56252e23-dbb1-8025-9eda-45cecaa9f04e', 'detail-type': 'foobar', 'source': 'whatevs', 'account': '119552584133', 'time': '2021-11-19T06:52:45Z', 'region': 'eu-west-1', 'resources': [], 'detail': {'action': 'add', 'args': {'x': 1, 'y': 'a'}}}
[ERROR] TypeError: unsupported operand type(s) for +: 'int' and 'str'Traceback (most recent call last): File "/var/task/index.py", line 7, in handler result=args["x"]+args["y"]
And the offending event never ends up in either the events DLQ or the Lambda destination.
What am I missing here, and what do I need to do to turn off these retries and have the event show up in a DLQ/destination?
(And for good measure, should error handling / retries be configured on the EventBridge side or the Lambda side? Surely I don't need both?)
AWSTemplateFormatVersion: '2010-09-09'
Outputs:
  MyEventBus:
    Value:
      Ref: MyEventBus
  MyEventsDLQ:
    Value:
      Ref: MyEventsDLQ
  MyFunctionDestination:
    Value:
      Ref: MyFunctionDestination
Parameters:
  LambdaHandlerName:
    Default: "index.handler"
    Type: String
  LambdaSize:
    Default: 512
    Type: Number
  LambdaRuntime:
    Default: 'python3.8'
    Type: String
  LambdaTimeout:
    Default: 5
    Type: Number
Resources:
  MyFunction:
    Properties:
      Code:
        ZipFile: |
          import logging
          logger=logging.getLogger()
          logger.setLevel(logging.INFO)
          def handler(event, context):
              logger.info("received: %s" % event)
              args=event["detail"]["args"]
              result=args["x"]+args["y"]
              logger.info("result: %s" % result)
      Handler:
        Ref: LambdaHandlerName
      MemorySize:
        Ref: LambdaSize
      Role:
        Fn::GetAtt:
          - MyFunctionRole
          - Arn
      Runtime:
        Ref: LambdaRuntime
      Timeout:
        Ref: LambdaTimeout
    Type: AWS::Lambda::Function
  MyFunctionRole:
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
        Version: '2012-10-17'
      Policies:
        - PolicyDocument:
            Statement:
              - Action: logs:*
                Effect: Allow
                Resource: '*'
              - Action: sqs:*
                Effect: Allow
                Resource: '*'
            Version: '2012-10-17'
          PolicyName:
            Fn::Sub: my-function-role-policy-${AWS::StackName}
    Type: AWS::IAM::Role
  MyEventsFunctionPermission:
    Properties:
      Action: lambda:InvokeFunction
      FunctionName:
        Ref: MyFunction
      Principal: events.amazonaws.com
      SourceArn:
        Fn::GetAtt:
          - MyEventRule
          - Arn
    Type: AWS::Lambda::Permission
  MyEventRule:
    Properties:
      EventBusName:
        Ref: MyEventBus
      EventPattern:
        detail:
          action:
            - add
      State: ENABLED
      Targets:
        - Arn:
            Fn::GetAtt:
              - MyFunction
              - Arn
          Id:
            Fn::Sub: my-rule-${AWS::StackName}
          RetryPolicy:
            MaximumRetryAttempts: 0
          DeadLetterConfig:
            Arn:
              Fn::GetAtt:
                - MyEventsDLQ
                - Arn
    Type: AWS::Events::Rule
  MyEventBus:
    Properties:
      Name:
        Fn::Sub: my-event-bus-${AWS::StackName}
    Type: AWS::Events::EventBus
  MyEventsDLQ:
    Properties: {}
    Type: AWS::SQS::Queue
  MyEventsDLQPolicy:
    Properties:
      Queues:
        - Ref: MyEventsDLQ
      PolicyDocument:
        Statement:
          - Action: sqs:SendMessage
            Effect: Allow
            Principal:
              Service: events.amazonaws.com
    Type: AWS::SQS::QueuePolicy
  MyFunctionDestination:
    Properties: {}
    Type: AWS::SQS::Queue
  MyFunctionEventConfig:
    Properties:
      DestinationConfig:
        OnFailure:
          Destination:
            Fn::GetAtt:
              - MyFunctionDestination
              - Arn
      FunctionName:
        Ref: MyFunction
      MaximumRetryAttempts: 0
      Qualifier:
        Fn::GetAtt:
          - MyFunctionVersion
          - Version
    Type: AWS::Lambda::EventInvokeConfig
  MyFunctionVersion:
    Properties:
      FunctionName:
        Ref: MyFunction
    Type: AWS::Lambda::Version
Try setting Qualifier: $LATEST on MyFunctionEventConfig.
As you say, the observed behaviour is consistent with the MyFunctionEventConfig Destination not being called at all. I suspect that is because you have qualified the Destination with a newly created Lambda version, MyFunctionVersion. But I do not believe you are ever invoking that version, so the Destination never gets invoked either.
Unless your AWS::Lambda::Version is doing work for you, you can delete it and use Qualifier: $LATEST.
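For concreteness, here is roughly what the suggested change could look like. It is a sketch of the MyFunctionEventConfig resource from the question's template with only the Qualifier swapped. It assumes the event rule keeps targeting the unqualified function ARN (which resolves to $LATEST) and that MyFunctionVersion is removed from the template; I have not deployed it.

```yaml
MyFunctionEventConfig:
  Properties:
    DestinationConfig:
      OnFailure:
        Destination:
          Fn::GetAtt:
            - MyFunctionDestination
            - Arn
    FunctionName:
      Ref: MyFunction
    MaximumRetryAttempts: 0
    # Changed: attach the config to $LATEST, which is what the rule's
    # unqualified target ARN invokes (assumption based on the template above)
    Qualifier: $LATEST
  Type: AWS::Lambda::EventInvokeConfig
```

With the config attached to the same qualifier that actually gets invoked, both MaximumRetryAttempts: 0 and the OnFailure destination should apply to the failing invocation.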
Edit - Further info:
Triggers and destinations are version dependent, as each Lambda version has its own ARN.
You can test this in the Lambda console without redeploying. If the version hypothesis is correct, the destination will not appear in the "Function overview" section of the console unless you first select the snapshotted version.
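If you would rather keep the published version, the alternative would be to make the rule invoke that version, so the invocation and the event invoke config share the same qualifier. Below is an untested sketch of the two resources that would change. It assumes that Ref on an AWS::Lambda::Version resolves to the version's qualified ARN, which is worth confirming in the CloudFormation reference before relying on it; everything else is copied from the question's template.

```yaml
MyEventRule:
  Properties:
    EventBusName:
      Ref: MyEventBus
    EventPattern:
      detail:
        action:
          - add
    State: ENABLED
    Targets:
      - Arn:
          # Assumption: Ref returns the version ARN (function ARN plus ":<version>")
          Ref: MyFunctionVersion
        Id:
          Fn::Sub: my-rule-${AWS::StackName}
        RetryPolicy:
          MaximumRetryAttempts: 0
        DeadLetterConfig:
          Arn:
            Fn::GetAtt:
              - MyEventsDLQ
              - Arn
  Type: AWS::Events::Rule
MyEventsFunctionPermission:
  Properties:
    Action: lambda:InvokeFunction
    FunctionName:
      # Grant invoke on the same qualified ARN that the rule targets
      Ref: MyFunctionVersion
    Principal: events.amazonaws.com
    SourceArn:
      Fn::GetAtt:
        - MyEventRule
        - Arn
  Type: AWS::Lambda::Permission
```

In this variant MyFunctionEventConfig stays as in the question, since its Qualifier already points at MyFunctionVersion.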