
How to suspend Oozie coordinator if a coordinator action fails?


In my use case, the output of each coordinator action is used by the next iteration, so if one action fails, unfinished or corrupted data is fed into the next coordinator action.

Is there any way to suspend an Oozie coordinator if one of the workflow steps in a coordinator action fails?

For example, instead of:

<action name="Delete_TMP_Files">
  <fs>
    <delete path='${outputPath}*'/>
  </fs>
  <ok to="End"/>
  <error to="Kill"/>
</action>

Can we do something like:

<action name="Delete_TMP_Files">
  <fs>
    <delete path='${outputPath}*'/>
  </fs>
  <ok to="End"/>
  <error to="Suspend"/>
</action>

so that the error can be diagnosed before its output gets overwritten by the next coordinator action?

PS: the fs delete action is not the actual use case here, just an example.


Solution

  • You cannot suspend a coordinator based on the failure of a workflow launched by a coordinator action.

    If the output of the workflow has a recognizable pattern, you can check for that pattern at the start of the workflow.

    Otherwise, you can touch a marker file as the last action of the workflow whenever it succeeds and delete that file when it fails (this works if every run uses the same file rather than a date-based one). Check for the file as the first step of the workflow and proceed only if it exists. You might need to create the file manually before the first run.

    You can also use the email action on failure so that you get notified.

    This is just a workaround.
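
    The marker-file check and the failure email described above could look roughly like the sketch below. It is only an illustration under some assumptions: markerPath and notifyEmail are hypothetical workflow properties you would define yourself, the node names are made up, the Do_Work action simply reuses the fs delete example from the question, and the email action only works if SMTP is configured on the Oozie server.

    <workflow-app name="guarded-wf" xmlns="uri:oozie:workflow:0.5">
      <start to="Check_Marker"/>

      <!-- Run only if the previous run left its success marker behind. -->
      <decision name="Check_Marker">
        <switch>
          <!-- markerPath is a hypothetical property, e.g. an HDFS file path -->
          <case to="Do_Work">${fs:exists(markerPath)}</case>
          <!-- Marker missing: an earlier run failed, so do not touch its output. -->
          <default to="Kill"/>
        </switch>
      </decision>

      <!-- Placeholder for the real work (taken from the question's example). -->
      <action name="Do_Work">
        <fs>
          <delete path='${outputPath}*'/>
        </fs>
        <ok to="Touch_Marker"/>
        <error to="Delete_Marker"/>
      </action>

      <!-- Success: (re)create the marker so the next coordinator action may run. -->
      <action name="Touch_Marker">
        <fs>
          <touchz path='${markerPath}'/>
        </fs>
        <ok to="End"/>
        <error to="Kill"/>
      </action>

      <!-- Failure: remove the marker so later coordinator actions stop at Check_Marker. -->
      <action name="Delete_Marker">
        <fs>
          <delete path='${markerPath}'/>
        </fs>
        <ok to="Notify_Failure"/>
        <error to="Notify_Failure"/>
      </action>

      <!-- notifyEmail is a hypothetical property holding the recipient address. -->
      <action name="Notify_Failure">
        <email xmlns="uri:oozie:email-action:0.1">
          <to>${notifyEmail}</to>
          <subject>Workflow ${wf:id()} failed</subject>
          <body>Failed node: ${wf:lastErrorNode()}. Error: ${wf:errorMessage(wf:lastErrorNode())}</body>
        </email>
        <ok to="Kill"/>
        <error to="Kill"/>
      </action>

      <kill name="Kill">
        <message>Workflow stopped; last error node: [${wf:lastErrorNode()}]</message>
      </kill>
      <end name="End"/>
    </workflow-app>

    With this layout the coordinator itself keeps materializing actions, but each one is killed at Check_Marker until someone inspects the failed output and recreates the marker file by hand, so the faulty data is not consumed or overwritten in the meantime.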