I am trying to implement an infinite retry of a Lambda function through Step Functions:
{
  "Comment": "A description of my state machine",
  "StartAt": "Check Export Status",
  "States": {
    "Check Export Status": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "OutputPath": "$.Payload",
      "Parameters": {
        "Payload.$": "$",
        "FunctionName": "arn:aws:lambda:eu-west-1:xxxx:function:xxxx:$LATEST"
      },
      "Next": "Glue StartJobRun",
      "Retry": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "BackoffRate": 1,
          "IntervalSeconds": 60,
          "MaxAttempts": 0
        }
      ]
    },
    "Glue StartJobRun": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun",
      "ResultPath": "$.error",
      "Parameters": {
        "JobName": "glue job test"
      },
      "End": true
    }
  }
}
Somehow, when the step function starts executing, it runs once, fails, and exits rather than retrying indefinitely. What am I missing?
You cannot retry indefinitely. The documentation says this:
MaxAttempts (Optional)
A positive integer that represents the maximum number of retry attempts (3 by default). If the error recurs more times than specified, retries cease and normal error handling resumes. A value of 0 specifies that the error or errors are never retried. MaxAttempts has a maximum value of 99999999.
Here is the link for reference: https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html#error-handling-retrying-after-an-error
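So setting MaxAttempts to 0 does not mean "retry forever"; it disables retries entirely, which is why your execution fails after a single attempt. The most you can ask for is 99999999 retries, which is still quite a lot.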
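As a minimal sketch of the change, assuming everything else in the "Check Export Status" state stays as posted, only the Retry field needs a different MaxAttempts value. The fragment below is wrapped in braces so it is valid JSON on its own; in the real definition the "Retry" key sits inside the state next to "Next".

```json
{
  "Retry": [
    {
      "ErrorEquals": ["States.ALL"],
      "IntervalSeconds": 60,
      "BackoffRate": 1,
      "MaxAttempts": 99999999
    }
  ]
}
```

With a fixed 60-second interval, that many retries would take far longer than the one-year maximum duration of a Standard workflow execution, so in practice the execution time limit is likely to be what ends the loop, not the retry count.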