I have an AWS step function that times out after 5 minutes and I cannot come up with a reason why this would happen. This is my state machine definition:
{
  "Comment": "This state machine will fire off a reminder lambda for an outage status update after a delay",
  "StartAt": "Wait",
  "States": {
    "Wait": {
      "Type": "Wait",
      "Comment": "Wait for an hour",
      "Seconds": 3600,
      "Next": "Send Reminder"
    },
    "Send Reminder": {
      "Type": "Task",
      "Comment": "Send a reminder in Slack for a status update",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "Payload.$": "$",
        "FunctionName": "${FunctionArn}"
      },
      "End": true
    }
  }
}
Whenever the state machine is kicked off, it enters the Wait state, but after 5 minutes in that state, the entire machine times out. As you can see in the logs, it never reaches the Lambda function to execute it.
As I understand it, if I don't put a timeout on the state machine, it should not time out by itself until 1 year after starting. I've tried adding "TimeoutSeconds": 7200 to the machine definition to see if explicitly setting a timeout would help, but that doesn't change the outcome.
What could cause this?
You probably created an Express Workflow. Express Workflows can run for at most 5 minutes, which matches the timeout you are seeing - see: Standard vs. Express Workflows.
Use a Standard Workflow if you want a state machine that can run for longer; the workflow type is chosen when the state machine is created. You can then use the TimeoutSeconds field at the top level of the state machine definition to limit the runtime (see State Machine Structure).
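
As a rough sketch of that last point, this is the definition from the question with TimeoutSeconds added at the top level. It assumes the state machine has been recreated as a Standard Workflow, since the workflow type is set when the state machine is created and is not part of the definition. The 7200 value is the one tried in the question, and ${FunctionArn} is the question's own placeholder.

```json
{
  "Comment": "Sketch only: assumes a Standard Workflow; TimeoutSeconds caps the whole execution at 2 hours",
  "StartAt": "Wait",
  "TimeoutSeconds": 7200,
  "States": {
    "Wait": {
      "Type": "Wait",
      "Comment": "Wait for an hour",
      "Seconds": 3600,
      "Next": "Send Reminder"
    },
    "Send Reminder": {
      "Type": "Task",
      "Comment": "Send a reminder in Slack for a status update",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "Payload.$": "$",
        "FunctionName": "${FunctionArn}"
      },
      "End": true
    }
  }
}
```

On an Express Workflow this setting would not extend the run past 5 minutes, which is consistent with the behavior described in the question.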