amazon-web-services aws-lambda amazon-sqs

AWS SQS -> Lambda connection: determine how many times an event has fired inside Lambda


I have a lambda that:

  • receives an event from SQS
  • sends call to third-party service
  • on failure, records a Rollbar error

SQS will retry the event 3 times, then send it to the DLQ.

The problem is that the call fails quite often but succeeds on retry. Each failure still triggers a Rollbar item that I have to go and check by hand.

I wondered if there was a way to record a Rollbar ONLY if this is the last of the 3 retries and the message will go to the DLQ next.

There is ApproximateReceiveCount, which seems to work, but I'm concerned about it having "Approximate" in its name.

  1. Is ApproximateReceiveCount a reliable way to determine whether the message is going to the DLQ next?
  2. Is there any other fast and reliable way to implement that?

This is an example of how I thought of doing it:

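import json

import rollbar  # rollbar.init(...) is assumed to be called elsewhere

MAX_RECEIVE_COUNT = 3  # assumed to match maxReceiveCount on the queue


@rollbar.lambda_function
def handler(event, context):
    # SQS passes attribute values as strings, so convert once up front.
    # Only the first record of the batch is inspected here.
    receive_count = int(event['Records'][0]['attributes']['ApproximateReceiveCount'])

    if receive_count < MAX_RECEIVE_COUNT:
        # Earlier attempt: SQS will redeliver the message.
        raise Exception('Simulate transient server failure')

    if receive_count >= MAX_RECEIVE_COUNT:
        # Last attempt: the message goes to the DLQ after this failure.
        raise Exception('Simulate final failure')

    # Not reached while both failures above are simulated.
    return {
        'statusCode': 200,
        'body': json.dumps({'message': 'Lambda function executed successfully.'})
    }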
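
The check itself could be isolated into a small helper. This is only a sketch: the names is_final_attempt and final_attempt_message_ids are hypothetical, and MAX_RECEIVE_COUNT = 3 is an assumption that has to match maxReceiveCount in the queue's redrive policy. It looks at every record in the batch rather than only the first one, since one invocation may receive several SQS messages.

```python
MAX_RECEIVE_COUNT = 3  # assumption: must match maxReceiveCount in the redrive policy


def is_final_attempt(record, max_receive_count=MAX_RECEIVE_COUNT):
    """Return True if this delivery is (approximately) the last one before the DLQ."""
    # SQS delivers attribute values as strings, so convert before comparing.
    receive_count = int(record["attributes"]["ApproximateReceiveCount"])
    return receive_count >= max_receive_count


def final_attempt_message_ids(event):
    """Collect the messageId of every record in the batch that is on its final attempt."""
    return [r["messageId"] for r in event["Records"] if is_final_attempt(r)]
```

A handler could consult is_final_attempt when a call fails and skip the error report on earlier attempts, while still failing the invocation so that SQS redelivers the message. The count remains approximate, so this narrows the noise rather than guaranteeing exactly one report.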

Solution

  • I think it's called ApproximateReceiveCount because SQS does not guarantee exactly-once delivery: in some edge cases the same message could be picked up by two Lambdas at the same time, in which case both would see the same value for ApproximateReceiveCount. If you really cannot tolerate these cases, then you can't rely on the ApproximateReceiveCount attribute. For a precise solution, you could:

    1. Store counts in an external DB, like DynamoDB, keyed by message ID or some other unique ID.
    2. Rely on messages being in the DLQ instead. I'm not sure how that integrates with Rollbar, as I haven't used it, but you can easily set up a DLQ alarm in CloudWatch and wire that to an external system, or even create another Lambda that processes DLQ messages and performs whatever custom processing you need.
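
The second option could be sketched as a separate Lambda subscribed to the DLQ, so that an error is reported once per message that exhausted its retries. This is a sketch under assumptions, not a tested integration: the function names are hypothetical, and the reporter is passed in as a plain callable so that the example has no external dependencies. In a real function it could wrap whatever error tracker is in use.

```python
import json


def summarize_dead_letter(record):
    """Build a short, human-readable description of a dead-lettered SQS record."""
    attributes = record.get("attributes", {})
    return "SQS message {} landed in the DLQ (ApproximateReceiveCount={})".format(
        record["messageId"],
        attributes.get("ApproximateReceiveCount", "unknown"),
    )


def handle_dlq_event(event, report):
    """Report every record in a DLQ-triggered event and return how many were reported.

    `report` is any callable taking (message, extra_data); it is injected
    to keep this sketch free of third-party imports.
    """
    reported = 0
    for record in event.get("Records", []):
        report(summarize_dead_letter(record), {"body": record.get("body")})
        reported += 1
    return reported


def handler(event, context):
    # Hypothetical entry point: print stands in for the real reporting call.
    count = handle_dlq_event(event, lambda msg, extra: print(msg, json.dumps(extra)))
    return {"statusCode": 200, "body": json.dumps({"reported": count})}
```

With this split, the main Lambda no longer needs to know which attempt it is on: it simply fails, and only messages that actually reach the DLQ produce a report.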