I have a workflow in Oozie. In this workflow I want to pass a table name as an argument. The table names are listed in a file, tables.txt, and I want to pass them from tables.txt to the workflow.
<workflow-app name="Shell_test" xmlns="uri:oozie:workflow:0.5">
<start to="test_shell"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="test_shell">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>shell.sh</exec>
<argument>${table}</argument>
<env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
<file>/user/oozie/lib/test_shell.sh#shell.sh</file>
<file>/user/oozie/input/tables.txt#tables.txt</file>
</shell>
<ok to="End"/>
<error to="email-error"/>
</action>
<action name="email-error">
<email xmlns="uri:oozie:email-action:0.2">
<to>xxxxxxxxxx.com</to>
<subject>Status of workflow ${table}</subject>
<body>The workflow ${table} ${wf:id()} had issues and was killed. The error message is: ${wf:errorMessage(wf:lastErrorNode())}</body>
<content_type>text/plain</content_type>
</email>
<ok to="End"/>
<error to="End"/>
</action>
<end name="End"/>
</workflow-app>
I was able to do this using the following in the workflow.
<argument>${input_file}</argument>
<env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
<file>/user/oozie/lib/test_shell.sh#shell.sh</file>
<file>/user/oozie/input/${input_file}#${input_file}</file>
Now I have a problem.
If the job fails for one of the tables in input_file, I do not get any email. I get an email only if it fails for the last table in input_file.
Why is this happening, and how can I get an email every time a table fails?
Or am I going about the whole process the wrong way?
Could anyone please explain and point out where I am going wrong?
This is my test_shell.sh:
while read line ;do
spark-submit --name "SparkJob" --master "yarn-client" test.py $line
done < tables.txt
A shell action does not behave like an ssh action. The shell action runs on one of the data nodes, and Oozie treats errors inside the script as warnings unless the script itself exits with a non-zero status, for example with exit 1. The other way to receive emails on failure is to call an email utility inside the script before doing exit 1, something like this:
echo 'script X returned an error due to some reason; Please check the workflow for validation' | mailx -r oozie -s 'SUBJECTTOEMAIL' [email protected]
For the email utility to work from a data node, make sure your data nodes have it installed. If it is not installed, you can run the email step over ssh on your edge node, which looks something like this:
ssh -o StrictHostKeyChecking=no ${edgeUser}@${edgeHost} "echo 'script X returned an error due to some reason; Please check the workflow for validation' | mailx -r oozie -s 'SUBJECTTOEMAIL' [email protected]"
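This also explains why only the last table matters: a shell script's exit status is the status of the last command it ran, so a failure in the middle of the loop is lost unless the script records it. Here is a minimal sketch of how test_shell.sh could record failures and exit non-zero. The function names run_job, notify_failure and process_tables are made up for this example, and the body of notify_failure is only a placeholder for whichever mail command works on your cluster.

```shell
#!/bin/sh
# Sketch only: run one job per table, remember failures, return non-zero at the end.

# Runs the job for one table. This mirrors the spark-submit line in test_shell.sh.
run_job() {
    spark-submit --name "SparkJob" --master "yarn-client" test.py "$1"
}

# Hypothetical hook, called once per failed table. Replace the body with the
# mailx (or ssh + mailx) command if an email per failed table is wanted.
notify_failure() {
    echo "table $1 failed" >&2
}

# Processes every table listed in the given file; returns 1 if any table failed.
process_tables() {
    tables_file="$1"
    failed=0
    while IFS= read -r table || [ -n "$table" ]; do
        # Skip blank lines.
        [ -n "$table" ] || continue
        # Redirect stdin so the job cannot swallow the rest of the table list.
        if ! run_job "$table" < /dev/null; then
            notify_failure "$table"
            failed=1
        fi
    done < "$tables_file"
    return "$failed"
}

# Usage at the end of test_shell.sh (left commented out in this sketch):
#   process_tables tables.txt || exit 1
```

With the final exit 1 in place, the shell action itself fails whenever any table failed, so the error transition to the email action in the workflow is taken as well.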
I can also suggest some changes to your workflow that might reflect errors better and make use of the email action in the workflow.
Don't call the config file from the shell action itself; instead, you can do it as below:
<workflow-app name="Shell_test" xmlns="uri:oozie:workflow:0.5">
<start to="test_shell"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="test_shell">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>shell.sh</exec>
<env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
<file>/user/oozie/lib/test_shell.sh#shell.sh</file>
<file>/user/oozie/input/tables.txt</file>
</shell>
If you look at the changes I made to your workflow, I removed the #tables.txt suffix, so tables.txt is simply shipped as a file under its own name.
When you do that, the shell action copies the file into the working directory of the container it runs in. To use the tables.txt config file inside the script, you source it like this: . ./tables.txt
This works because the file has already been copied into the container, so the script can refer to tables.txt directly in its current working directory. Note that sourcing only makes sense if tables.txt contains shell syntax such as variable assignments; a plain list of table names still has to be read line by line.
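As a rough illustration of the sourcing approach, here is a sketch that assumes tables.txt is written as a config file defining a TABLES variable, for example TABLES="customers orders". That file format, the TABLES name and the helper list_tables_from_config are assumptions made for this example; they are not part of the original setup.

```shell
#!/bin/sh
# Sketch only: source a config-style file and print one table name per line.
# Assumes the file contains a line such as: TABLES="customers orders"
list_tables_from_config() {
    config_file="$1"
    TABLES=""
    # Pass a path that contains a slash (for example ./tables.txt) so that "."
    # does not search PATH for the file.
    . "$config_file"
    # Unquoted on purpose: split the list on whitespace.
    for t in $TABLES; do
        printf '%s\n' "$t"
    done
}

# Usage inside the script (left commented out in this sketch):
#   list_tables_from_config ./tables.txt
```

If tables.txt stays a plain list with one table name per line, the while read loop from the question is still the right way to consume it.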
Hopefully this helps. Please comment if you have any questions about the solution I've suggested.