How to fork three jobs that use the same generic workflow.xml with different lists of parameters?


I am a beginner in the Hadoop ecosystem. I am trying to fork three different jobs that all invoke the same generic workflow.xml file, but I want to pass different parameters to each of the sub-workflows.

Parent workflow that forks the three sub-workflow actions:

<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.4" name="special-fork">

    <global>
        <job-tracker>${jT}</job-tracker>
        <name-node>${nN}</name-node>
    </global>

    <!-- Entry point: Oozie requires a start node that names the first node to run -->
    <start to="special-fork"/>

    <fork name="special-fork">
        <path start="aa"/>
        <path start="bb"/>
        <path start="cc"/>
    </fork>

    <action name="aa">
        <sub-workflow>
            <app-path>${nN}/xyz/workflow.xml</app-path>
            <propagate-configuration/>
        </sub-workflow>
        <ok to="special-join"/>
        <error to="kill"/>
    </action>

    <action name="bb">
        <sub-workflow>
            <app-path>${nN}/xyz/workflow.xml</app-path>
            <propagate-configuration/>
        </sub-workflow>
        <ok to="special-join"/>
        <error to="kill"/>
    </action>

    <action name="cc">
        <sub-workflow>
            <app-path>${nN}/xyz/workflow.xml</app-path>
            <propagate-configuration/>
        </sub-workflow>
        <ok to="special-join"/>
        <error to="kill"/>
    </action>

    <join name="special-join" to="end"/>

    <action name="email-alert-fail">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>${emailing_list}</to>
            <subject>Oozie workflow Failed</subject>
            <body>
            </body>
        </email>
        <ok to="kill"/>
        <error to="kill"/>
    </action>

    <kill name="kill">
        <message>Map-Reduce Failed</message>
    </kill>

    <end name="end"/>
</workflow-app>

I want to pass a list of parameters such as source, input_path, output_path, and credentials, with different values for each of the three actions aa, bb, and cc. How can I pass these to each of the three sub-workflows?

Thanks.


Solution

  • You can use the configuration element of the sub-workflow action to pass the required parameters. Add one property for each parameter you need. It looks like this:

    <action name="aa">
        <sub-workflow>
            <app-path>${nN}/xyz/workflow.xml</app-path>
            <propagate-configuration/>
            <configuration>
                <property>
                   <name>input_path</name>
                   <value>your_input_path</value>
                </property>
            </configuration>
        </sub-workflow>
        <ok to="special-join"/>
        <error to="kill"/>
    </action>
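
    As a sketch of how this extends to the other two paths of the fork, each action can carry its own configuration block while pointing at the same app-path. The property names source, input_path, and output_path come from the question; every value below (bb_source, /data/bb/in, and so on) is a made-up placeholder, so substitute your own. This assumes that properties set inline are the ones the child workflow sees when a propagated property has the same name.

    <action name="bb">
        <sub-workflow>
            <app-path>${nN}/xyz/workflow.xml</app-path>
            <propagate-configuration/>
            <configuration>
                <!-- Placeholder values specific to the bb path -->
                <property>
                    <name>source</name>
                    <value>bb_source</value>
                </property>
                <property>
                    <name>input_path</name>
                    <value>/data/bb/in</value>
                </property>
                <property>
                    <name>output_path</name>
                    <value>/data/bb/out</value>
                </property>
            </configuration>
        </sub-workflow>
        <ok to="special-join"/>
        <error to="kill"/>
    </action>

    <action name="cc">
        <sub-workflow>
            <app-path>${nN}/xyz/workflow.xml</app-path>
            <propagate-configuration/>
            <configuration>
                <!-- Same property names, different placeholder values for the cc path -->
                <property>
                    <name>source</name>
                    <value>cc_source</value>
                </property>
                <property>
                    <name>input_path</name>
                    <value>/data/cc/in</value>
                </property>
                <property>
                    <name>output_path</name>
                    <value>/data/cc/out</value>
                </property>
            </configuration>
        </sub-workflow>
        <ok to="special-join"/>
        <error to="kill"/>
    </action>

    Inside the generic workflow.xml, these values should then be available as ${source}, ${input_path}, and ${output_path}, the same way job properties normally are.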