Tags: node.js, aws-sdk, aws-step-functions

Update (skip) wait duration on AWS Step Function


I have a Step Function with a 'wait' state (e.g., 999999 seconds). Once the wait is over, the Step Function invokes a Lambda. Sometimes I want to interrupt the wait and trigger the Lambda immediately. Is this possible?

I thought I could use the aws-sdk and the Step Functions API to skip the wait manually, but my experiments have had no success.

I tried the API's StartExecution method, but it only starts the entire Step Function (https://docs.aws.amazon.com/step-functions/latest/apireference/API_StartExecution.html). I can't find anything for manipulating individual steps.

I can use GetExecutionHistory to return an event object that describes the Wait step, eg:

{
    timestamp: 2022-10-17T08:38:27.849Z,
    type: 'WaitStateEntered',
    id: 2,
    previousEventId: 0,
    stateEnteredEventDetails: {
      name: 'Wait',
      input: '{\n    "Comment": "Insert your JSON here"\n}',
      inputDetails: {truncated: false}
    }
  },

But there doesn't seem to be a way to manipulate this event to move to the next step.


Solution

  • I've spoken to AWS tech support, who confirmed that nothing in the aws-sdk or the aws-cdk can update an existing state (e.g., a 'wait' state) while it is running. There are some workarounds:

    1. AWS tech support suggests iterating a loop using a Lambda. The Execution cycles through Choice > Wait > Lambda > (repeat), where the Lambda returns an output that tells the Choice state whether to continue the loop or direct the Execution to another state. The advantage is that we don't need to cancel the Execution, and we keep a simpler record of activities. The disadvantage is that we invoke a Lambda on every iteration.
    2. As per @Guy's suggestion, we could split the Step Function into two separate Step Functions. We could then cancel the first Step Function and trigger the second one manually.
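
A minimal sketch of workaround 1, assuming a short polling wait instead of one long wait. The state names, the `deadline` input field, the `keepWaiting` output field, and the placeholder Lambda ARNs are all hypothetical. The loop here enters at the Wait state, so the Choice state always has a Lambda result to inspect.

```javascript
// Hypothetical polling interval; shorter values skip sooner but invoke the Lambda more often.
const POLL_SECONDS = 60;

// State machine definition (Amazon States Language) expressed as a JS object.
const definition = {
  StartAt: "ShortWait",
  States: {
    ShortWait: { Type: "Wait", Seconds: POLL_SECONDS, Next: "CheckWait" },
    CheckWait: {
      Type: "Task",
      Resource: "arn:aws:lambda:REGION:ACCOUNT_ID:function:check-wait", // placeholder ARN
      ResultPath: "$.check", // keep the original input, add the Lambda result under $.check
      Next: "KeepWaiting?",
    },
    "KeepWaiting?": {
      Type: "Choice",
      Choices: [
        { Variable: "$.check.keepWaiting", BooleanEquals: true, Next: "ShortWait" },
      ],
      Default: "InvokeLambda",
    },
    InvokeLambda: {
      Type: "Task",
      Resource: "arn:aws:lambda:REGION:ACCOUNT_ID:function:do-work", // placeholder ARN
      End: true,
    },
  },
};

// Decision logic for the check Lambda: stop looping once the deadline has
// passed or a skip has been requested. Where the skip flag is stored
// (a database row, a parameter, ...) is left open in this sketch.
function decide(event, nowMs, skipRequested) {
  const deadlineReached = nowMs >= Date.parse(event.deadline);
  return { keepWaiting: !(skipRequested || deadlineReached) };
}
```

Every pass through the loop adds events to the execution history, and Standard workflows are limited to 25,000 history events per execution, so a very long total wait with a short polling interval may not fit in a single execution.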

    We can cancel the execution of a Step Function with stopExecution. For example, using the aws-sdk:

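    import { config, StepFunctions } from "aws-sdk"; // package.json: "aws-sdk": "^2.1232.0"

    config.update({ region: "eu-west-2" });
    const stepFunctions = new StepFunctions();

    // Stop the running execution. `error` is an error code and `cause` a
    // description; both are recorded on the stopped execution.
    // Top-level await assumes an ES module (otherwise wrap this in an async function).
    const stoppedExecution = await stepFunctions
      .stopExecution({
        executionArn: "...",
        cause: "...",
        error: "...",
      })
      .promise();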

    We can then trigger the second Step Function with startExecution.
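
The stop-then-start sequence for workaround 2 might look like the sketch below. It assumes `client` behaves like an aws-sdk v2 `StepFunctions` instance (each method returns an object with `.promise()`); the function name, parameter names, and the `"WaitSkipped"` error code are made up for illustration.

```javascript
// Stop the execution that is sitting in the long Wait, then start the
// follow-up state machine. Returns the ARN of the new execution.
async function skipToSecondMachine(client, { waitingExecutionArn, followUpStateMachineArn, input }) {
  await client
    .stopExecution({
      executionArn: waitingExecutionArn,
      cause: "Wait skipped manually",
      error: "WaitSkipped", // hypothetical error code
    })
    .promise();

  const started = await client
    .startExecution({
      stateMachineArn: followUpStateMachineArn,
      input: JSON.stringify(input), // startExecution expects the input as a JSON string
    })
    .promise();

  return started.executionArn;
}
```

With the real SDK, `client` would be `new StepFunctions()`. Passing the client in as an argument also makes the function easy to exercise with a stub.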

    3. Step Functions also lets us wait for a callback with a task token. The Task state passes a task token to an integrated service (e.g., a Lambda) and then pauses until that token is returned. Once the token is received, the Execution proceeds to the next step.
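
Returning the token is done with the SendTaskSuccess API. A minimal sketch, assuming `client` behaves like an aws-sdk v2 `StepFunctions` instance and that the task token was saved somewhere retrievable when the Lambda received it; the function name and the default output are hypothetical.

```javascript
// Resume an execution that is paused on a task token.
// `output` becomes the result of the paused Task state.
async function resumeWaitingTask(client, taskToken, output = {}) {
  await client
    .sendTaskSuccess({
      taskToken,
      output: JSON.stringify(output), // sendTaskSuccess expects a JSON string
    })
    .promise();
}
```

Calling `resumeWaitingTask(new StepFunctions(), savedToken, { skipped: true })` at any point during the wait should let the Execution move on immediately.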

    I can think of two ways to proceed with the task-token approach:

    a. Configure a Heartbeat Timeout for a Waiting Task. If the Heartbeat Timeout elapses without the task token being returned, the task fails with a States.Timeout error. We can (I assume) handle that error with a Catch on the Task state and move to the next step anyway. The default behaviour is then to trigger the next step once the duration elapses, and we can also skip the wait by sending the task token back to the Execution.

    b. Use another service to perform the wait and return the task token once the wait duration has elapsed.
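
Option a could be sketched as a Task state definition like the one below. This is untested against a live state machine; the helper name and its parameters are hypothetical, and the `$.timeout` result path is an arbitrary choice. It relies on the `.waitForTaskToken` Lambda integration, `HeartbeatSeconds`, and a `Catch` for `States.Timeout`.

```javascript
// Build a Task state that waits for a task token, but falls through to
// `next` when the heartbeat timeout elapses without the token coming back.
function callbackWaitState({ functionName, waitSeconds, next }) {
  return {
    Type: "Task",
    Resource: "arn:aws:states:::lambda:invoke.waitForTaskToken",
    Parameters: {
      FunctionName: functionName,
      // Hand the token to the Lambda so it can be stored and returned later.
      Payload: { "taskToken.$": "$$.Task.Token" },
    },
    HeartbeatSeconds: waitSeconds,
    Catch: [
      {
        ErrorEquals: ["States.Timeout"],
        ResultPath: "$.timeout", // keep the original input, attach the error details
        Next: next,
      },
    ],
    Next: next, // the same target is reached when the token is returned early
  };
}
```

For example, `callbackWaitState({ functionName: "store-token", waitSeconds: 999999, next: "InvokeLambda" })` would produce a state that reaches `InvokeLambda` either after the timeout or as soon as the token is sent back.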