aws-cdk, aws-step-functions

AWS Step Functions timeout not caught by error handler


I have the following state, where the timeout is not caught even though the state has a Catch with "States.ALL". According to https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html, it should be. Can you tell me what is wrong?

   "PublishIotCmd&WaitTask": {
      "Next": "SuccedTask",
      "Retry": [
        {
      [..]
        }
      ],
      "Catch": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "ResultPath": "$.error",
          "Next": "ErrorHandlerTask"
        }
      ],
      "Type": "Task",
      "TimeoutSeconds": 600,
      "ResultPath": "$.cmdResult",
      "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
      "Parameters": {
        "FunctionName": "xx",
        "Payload": {
          "token.$": "$$.Task.Token",
          "request.$": "$.detail"
        }
      }
    },

In this specific case, the timeout occurs because the task never receives the token via sendTaskSuccess. A timeout error is therefore expected, but "ErrorHandlerTask" is not called; the state machine just hangs.
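To make the expectation concrete, here is a minimal sketch of how a Catch list is matched against an error name. It is a simplified model for illustration, not the Step Functions implementation; `CatchRule`, `findCatchTarget`, and `NOT_CAUGHT_BY_ALL` are hypothetical names, and the model does not cover errors that cannot be caught at all.

```typescript
// Simplified model of Catch matching; not the actual Step Functions implementation.
interface CatchRule {
  ErrorEquals: string[];
  Next: string;
}

// Error names that, per the error-handling documentation, the States.ALL wildcard does not catch.
const NOT_CAUGHT_BY_ALL: string[] = ["States.DataLimitExceeded", "States.Runtime"];

// Hypothetical helper: returns the Next state of the first matching rule, or undefined.
function findCatchTarget(rules: CatchRule[], errorName: string): string | undefined {
  for (const rule of rules) {
    const exact = rule.ErrorEquals.indexOf(errorName) !== -1;
    const wildcard =
      rule.ErrorEquals.indexOf("States.ALL") !== -1 &&
      NOT_CAUGHT_BY_ALL.indexOf(errorName) === -1;
    if (exact || wildcard) {
      return rule.Next;
    }
  }
  return undefined;
}

// Same catcher as in the state definition above.
const rules: CatchRule[] = [{ ErrorEquals: ["States.ALL"], Next: "ErrorHandlerTask" }];
const target = findCatchTarget(rules, "States.Timeout");
console.log(target);
```

Under this model, a States.Timeout error raised by the task would be routed to "ErrorHandlerTask", which is why the hang is unexpected.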

// Task that invokes the Lambda and then waits for a callback carrying the task token.
const publishIot = new tasks.LambdaInvoke(this, 'PublishIotCmd&WaitTask', {
  lambdaFunction: iotSendCommandFn,
  payload: sfn.TaskInput.fromObject({
    token: sfn.JsonPath.taskToken,
    //request: sfn.JsonPath.entirePayload,
    request: sfn.JsonPath.stringAt('$.detail'),
  }),
  resultPath: '$.cmdResult',
  integrationPattern: sfn.IntegrationPattern.WAIT_FOR_TASK_TOKEN,
  // Rendered as TimeoutSeconds in the state definition above.
  timeout: Duration.minutes(TIMEOUT_WAIT_REPLY_SECONDS),
});

Thank you in advance


Solution

  • With task tokens, I believe you're supposed to use the Heartbeat timeout rather than a general timeout.

    The docs call out that "The "HeartbeatSeconds": 600 field sets the heartbeat timeout interval to 10 minutes." and that "If the waiting task doesn't receive a valid task token within that 10-minute period, the task fails with a States.Timeout error name."

    I think Heartbeat works here because this is a different kind of service integration.

    https://docs.aws.amazon.com/step-functions/latest/dg/connect-to-resource.html#connect-wait-token
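
As a rough sketch of the suggestion above, the helper below adds HeartbeatSeconds to a task state definition expressed as a plain object. The `TaskState` type and the `withHeartbeat` function are hypothetical names for illustration and are not part of any AWS library; the field names mirror the state definition in the question. The check against TimeoutSeconds reflects the Amazon States Language rule that, when both are set, the heartbeat must be smaller than the total timeout.

```typescript
// Illustrative subset of an Amazon States Language task state.
interface Catcher {
  ErrorEquals: string[];
  ResultPath?: string;
  Next: string;
}

interface TaskState {
  Type: "Task";
  Resource: string;
  Next?: string;
  TimeoutSeconds?: number;
  HeartbeatSeconds?: number;
  Catch?: Catcher[];
}

// Hypothetical helper: returns a copy of the state with a heartbeat timeout set.
function withHeartbeat(state: TaskState, seconds: number): TaskState {
  if (Math.floor(seconds) !== seconds || seconds <= 0) {
    throw new Error("HeartbeatSeconds must be a positive integer");
  }
  if (state.TimeoutSeconds !== undefined && seconds >= state.TimeoutSeconds) {
    throw new Error("HeartbeatSeconds must be smaller than TimeoutSeconds");
  }
  // Copy rather than mutate, so the original definition stays untouched.
  return { ...state, HeartbeatSeconds: seconds };
}

// Task state modeled on the question, without a general timeout.
const waitTask: TaskState = {
  Type: "Task",
  Resource: "arn:aws:states:::lambda:invoke.waitForTaskToken",
  Next: "SuccedTask",
  Catch: [{ ErrorEquals: ["States.ALL"], ResultPath: "$.error", Next: "ErrorHandlerTask" }],
};

const patched = withHeartbeat(waitTask, 600);
console.log(JSON.stringify(patched, null, 2));
```

In the CDK code from the question, the equivalent change would presumably be a heartbeat setting on the LambdaInvoke task alongside (or instead of) `timeout`; the exact property name depends on the CDK version in use.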