Creating a directory in HDFS using Oozie


In an Oozie workflow, how can we create a directory in HDFS and copy files from Linux to HDFS?

I want to do the following in the workflow:

hdfs dfs -mkdir -p /user/$USER/logging/`date "+%Y-%m-%d"`/logs


hdfs dfs -put /home/$USER/logs/"${table}" /user/$USER/logging/`date "+%Y-%m-%d"`/logs/

How can I achieve that?

I have tried the following, but without success:

<action name="Copy_to_HDFS">
    <fs>
        <mkdir path='/user/$USER/logging/`date "+%Y-%m-%d"`/logs'/>
        <move source='/home/$USER/logs${table}' target='/user/$USER/logging/`date "+%Y-%m-%d"`/logs/'/>
    </fs>
    <ok to="end"/>
    <error to="end"/>
</action>

How can we create a folder with the name of that particular date?

Full workflow:

    <workflow-app name="Shell_hive" xmlns="uri:oozie:workflow:0.5">
        <start to="shell-b8e7"/>
        <kill name="Kill">
            <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <action name="shell-b8e7">
            <shell xmlns="uri:oozie:shell-action:0.1">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <exec>shell.sh</exec>
                <argument>${table}</argument>
                <env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
                <env-var>HADOOP_CONF_DIR=/etc/hadoop/conf</env-var>
                <file>/user/$USER/oozie/scripts/lib/shell.sh#shell.sh</file>
            </shell>
            <ok to="End"/>
            <error to="Kill"/>
        </action>
        <action name="Copy_to_HDFS">
            <fs>
                <mkdir path="/user/$USER/logging/2017-04-24/logs"/>
                <move source="/tmp/logging/${table}" target="/user/$USER/logging/$(date +%Y-%m-%d)/logs/"/>
            </fs>
            <ok to="end"/>
            <error to="end"/>
        </action>
        <end name="End"/>
    </workflow-app>

Solution

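  • The problem is with the quotes: single quotes (') block variable expansion, so use double quotes. Note also that Oozie resolves ${...} EL expressions itself rather than passing the path through a shell, so $USER and $(date ...) may need to be replaced with EL expressions or job properties (for example ${wf:user()}).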

    <action name="Copy_to_HDFS">
        <fs>
            <mkdir path="/user/$USER/logging/$(date +%Y-%m-%d)/logs"/>
            <move source="/home/$USER/logs/${table}" target="/user/$USER/logging/$(date +%Y-%m-%d)/logs/"/>
        </fs>
        <ok to="End"/>
        <error to="End"/>
    </action>
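
    A minimal sketch of the same action written with Oozie EL only, in case the shell-style $USER and $(date ...) end up in the path as literal text. It assumes a hypothetical logDate property (for example 2017-04-24) supplied with the job configuration, prefixes paths with ${nameNode} as the shell action does, and assumes the source is already an HDFS path, since the fs action operates on HDFS rather than on the local Linux filesystem:

    <action name="Copy_to_HDFS">
        <fs>
            <!-- logDate is a hypothetical job property, e.g. logDate=2017-04-24 -->
            <mkdir path="${nameNode}/user/${wf:user()}/logging/${logDate}/logs"/>
            <!-- the source must already be in HDFS; /tmp/logging is taken from the full workflow -->
            <move source="${nameNode}/tmp/logging/${table}" target="${nameNode}/user/${wf:user()}/logging/${logDate}/logs/"/>
        </fs>
        <ok to="End"/>
        <!-- routing errors to Kill is an assumption; the question routes them to the end node -->
        <error to="Kill"/>
    </action>

    If the files really live on a local disk, the hdfs dfs -put step would probably have to run inside the script of the shell action instead.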
    

    UPDATE: The Copy_to_HDFS action is never invoked: on success, the shell-b8e7 action transitions straight to End. Change the workflow so that shell-b8e7 transitions to Copy_to_HDFS instead. For example:

    </shell>
    <ok to="Copy_to_HDFS"/>
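
    Put together, the wiring might look like the sketch below. It reuses the node names from the question. The logDate property is a hypothetical job parameter, ${wf:user()} stands in for $USER, and the final transitions point at End and Kill because node names appear to be matched case-sensitively (to="end" would not match the node named End):

    <workflow-app name="Shell_hive" xmlns="uri:oozie:workflow:0.5">
        <start to="shell-b8e7"/>
        <kill name="Kill">
            <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <action name="shell-b8e7">
            <shell xmlns="uri:oozie:shell-action:0.1">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <exec>shell.sh</exec>
                <argument>${table}</argument>
                <env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
                <env-var>HADOOP_CONF_DIR=/etc/hadoop/conf</env-var>
                <file>/user/${wf:user()}/oozie/scripts/lib/shell.sh#shell.sh</file>
            </shell>
            <!-- on success, continue to the copy step instead of ending the workflow -->
            <ok to="Copy_to_HDFS"/>
            <error to="Kill"/>
        </action>
        <action name="Copy_to_HDFS">
            <fs>
                <!-- logDate is a hypothetical job property, e.g. logDate=2017-04-24 -->
                <mkdir path="${nameNode}/user/${wf:user()}/logging/${logDate}/logs"/>
                <move source="${nameNode}/tmp/logging/${table}" target="${nameNode}/user/${wf:user()}/logging/${logDate}/logs/"/>
            </fs>
            <ok to="End"/>
            <error to="Kill"/>
        </action>
        <end name="End"/>
    </workflow-app>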