Search code examples
aws-step-functions

Need to understand step function retry mechanism


May be this can be answered quickly. I am working on adding a retry mechanism in step functions. I want to keep retrying the failed activity for 24 hours at a interval of 1 hour or so.

I was going thorough the "Retrying after an error" section here: https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html It says that "The multiplier by which the retry interval increases during each attempt (2.0 by default)." And also given example of 1st retry after 3 secs and next retry after 4.5 secs because, BackoffRate is 1.5.

So is this not exponential backoff? Because, on the same page, under "Handling a failure using Retry" section, it says the backoff will be applied exponentially.


Solution

  • Here is how interval between retry works:

    Interval = IntervalSeconds*(BackoffRate)^(attempt-1)

    So for this config:

          "Retry": [
            {
              "ErrorEquals": ["States.ALL"],
              "BackoffRate": 2,
              "IntervalSeconds": 6,
              "MaxAttempts": 6
            }
          ]
    

    We have these intervals:

    attempt.      interval
    ---------------------------
    1             6*(2^0)  =  6
    2             6*(2^1)  =  12
    3             6*(2^2)  =  24