amazon-emr aws-step-functions

AWS Step Functions Parallel state to orchestrate EMR jobs


We are orchestrating a data pipeline with AWS Step Functions, and we need to run EMR jobs in parallel. I have tried using a Map state and it works as expected. The only problem with Map is that if one step fails, it cancels all the other steps as well. To overcome this, I am wondering whether we can create an array of steps and pass it dynamically to Branches in a Parallel state, but I have not been able to do that because Branches does not accept strings. Is there a workaround for this, or can we only hard-code branches in a Parallel state? Can States.Array() be helpful in some way in this situation?


Solution

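  • Keep the Map state, but wrap the inner state machine (the Map iterator) in a one-branch Parallel state and add Retry/Catch policies to that Parallel state. Basically, you want to catch all errors (States.ALL) so that every iteration always succeeds from the Map state's point of view, and one failed step no longer cancels the others.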
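
A minimal sketch of that idea, written as a Python script that builds the state machine definition and prints it as JSON. The state names (RunAllSteps, IsolateFailure, RunEmrStep, RecordFailure) and the input shape (a "steps" array whose items carry clusterId, stepName and args) are assumptions made up for illustration, not something from the question. The retry numbers are placeholders too. It uses the Map state's "Iterator" field; newer definitions call it "ItemProcessor".

```python
import json


def build_definition():
    """Build a Map state whose iterator is a one-branch Parallel wrapper."""
    # The actual EMR job. The input fields ($.clusterId, $.stepName, $.args)
    # are hypothetical; adapt them to your own item shape.
    run_emr_step = {
        "Type": "Task",
        "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
        "Parameters": {
            "ClusterId.$": "$.clusterId",
            "Step": {
                "Name.$": "$.stepName",
                "ActionOnFailure": "CONTINUE",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args.$": "$.args",
                },
            },
        },
        "End": True,
    }

    # One-branch Parallel state: any error raised inside the branch surfaces
    # here, where it is retried and finally caught.
    isolate_failure = {
        "Type": "Parallel",
        "Branches": [
            {"StartAt": "RunEmrStep", "States": {"RunEmrStep": run_emr_step}}
        ],
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 30,  # placeholder values
                "MaxAttempts": 2,
                "BackoffRate": 2.0,
            }
        ],
        "Catch": [
            {
                # Catch everything so the iteration never fails.
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.error",  # keep the item, attach the error
                "Next": "RecordFailure",
            }
        ],
        "End": True,
    }

    # Terminal state for a failed item; the iteration still ends successfully.
    record_failure = {"Type": "Pass", "End": True}

    return {
        "StartAt": "RunAllSteps",
        "States": {
            "RunAllSteps": {
                "Type": "Map",
                "ItemsPath": "$.steps",
                "MaxConcurrency": 0,  # 0 means no explicit concurrency limit
                "Iterator": {
                    "StartAt": "IsolateFailure",
                    "States": {
                        "IsolateFailure": isolate_failure,
                        "RecordFailure": record_failure,
                    },
                },
                "End": True,
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_definition(), indent=2))
```

With this shape, a failed item comes back in the Map output with an "error" field attached instead of aborting the whole Map state, so a later state can inspect the results and decide whether the pipeline as a whole should fail.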