java amazon-web-services hadoop oozie oozie-workflow

How to point multiple Oozie workflows to a centralized jar location


I have more than 10 Oozie workflows. Each workflow has its workflow.xml, its coordinator.properties and XML, and a lib folder in a separate folder. All the workflows share some common jars of around 6 MB, and I have to copy the same jars into each lib folder. What is the best way to set up a common jar location, so that I don't need to copy the same jars into each workflow folder for Java actions?

<action name="aggr_stage" retry-max="3" retry-interval="1">
    <java>
        <main-class>com.*.*.ReportGenerator</main-class>
        <arg>${reprocessing}</arg>
        <arg>${timeZone}</arg>
    </java>
    <ok to="notifyJobSuccess" />
    <error to="notifyJobFailure" />
</action>

Solution

  • Add a <file> tag to your Oozie action and reference the same jar from every action that needs it.
    See the Oozie docs.

    <action name="aggr_stage" retry-max="3" retry-interval="1">
        <java>
            <main-class>com.*.*.ReportGenerator</main-class>
            <arg>${reprocessing}</arg>
            <arg>${timeZone}</arg>
            <file>hdfs://<namenode>:<port>/<path-to-your-jar>/your-report-generator.jar</file>
        </java>
        <ok to="notifyJobSuccess" />
        <error to="notifyJobFailure" />
    </action>
    

    Instead of the full HDFS path, you can also give the <file> tag a path to the jar that is relative to your workflow.xml.
    Example: ../../your-report-generator.jar
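For completeness, the relative-path variant might look like the sketch below. It reuses the action from the question and assumes the shared jar sits two directories above the workflow application directory, as in the example path; the jar name is the same placeholder used above, and how the relative path resolves should be checked against your Oozie version.

```xml
<action name="aggr_stage" retry-max="3" retry-interval="1">
    <java>
        <main-class>com.*.*.ReportGenerator</main-class>
        <arg>${reprocessing}</arg>
        <arg>${timeZone}</arg>
        <!-- Assumed layout: the shared jar lives two levels above this workflow's folder -->
        <file>../../your-report-generator.jar</file>
    </java>
    <ok to="notifyJobSuccess" />
    <error to="notifyJobFailure" />
</action>
```

Every workflow that uses the same <file> entry then picks up the single shared copy, so the jar no longer has to be duplicated in each lib folder.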