amazon-web-services aws-lambda aws-batch

AWS Batch jobs scheduled in a sequence: if a job fails, re-invoke the sequence starting from the failed job


I have a Lambda function that submits AWS Batch jobs in sequence (each job depends on the previous one). If one Batch job in the sequence fails, all the jobs after it fail as well.

Is it possible to re-run the failed Batch job and then restart the sequence from that job onward?


Solution

  • No, Batch jobs are immutable. If job 2 depends on job 1, then when job 1 fails, you cannot change job 2 to depend on a new job 3 that you create to replace job 1.

    Instead you'll need to create a new sequence of jobs that are identical to the original set of jobs starting from the point that failed.

    Batch also supports automatic retries, configured through retryStrategy.attempts. The value is the total number of attempts (1 to 10), so a failed job is retried until it has run attempts times in total. If it succeeds on any of those attempts, the dependent jobs run as if the job had succeeded on the first try.
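
Both suggestions above can be combined in a short boto3 sketch: resubmit the tail of the sequence as a new dependency chain, and set retryStrategy.attempts on each job so that transient failures are retried automatically. This is only an illustration, not the asker's actual Lambda. The function name resubmit_from and the shape of job_specs (a list of dicts holding jobName, jobQueue and jobDefinition) are assumptions; submit_job, dependsOn and retryStrategy are the real Batch API parameters.

```python
def resubmit_from(batch, job_specs, failed_index, attempts=3):
    """Submit job_specs[failed_index:] as a fresh chain of dependent jobs.

    batch        -- a boto3 Batch client, e.g. boto3.client("batch")
    job_specs    -- ordered list of dicts with jobName, jobQueue and
                    jobDefinition (hypothetical shape, adapt as needed)
    failed_index -- position of the job that failed in the original run
    attempts     -- value for retryStrategy.attempts (Batch accepts 1 to 10)
    """
    new_ids = []
    previous_id = None
    for spec in job_specs[failed_index:]:
        request = {
            "jobName": spec["jobName"],
            "jobQueue": spec["jobQueue"],
            "jobDefinition": spec["jobDefinition"],
            # Let Batch retry the job itself before the chain is marked failed.
            "retryStrategy": {"attempts": attempts},
        }
        # Every job except the first in the new chain waits on its predecessor.
        if previous_id is not None:
            request["dependsOn"] = [{"jobId": previous_id}]
        response = batch.submit_job(**request)
        previous_id = response["jobId"]
        new_ids.append(previous_id)
    return new_ids
```

In a Lambda handler this might be called as resubmit_from(boto3.client("batch"), job_specs, failed_index), where failed_index is found by checking the original jobs' status (for example with describe_jobs). The old, failed jobs are left untouched; the new jobs get new job IDs.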