Tags: hadoop, sqoop, oozie

How to create/declare global variables in Oozie?


I'm creating an Oozie workflow that needs multiple shell actions, but I'm running into the problem that I have to declare the same environment variable separately for every shell action: if I have 10 shell actions, I have to declare it 10 times. My question is: is there any way to declare/create global variables so I can avoid duplicating variables that serve the same purpose?
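
For reference, Oozie workflow schema 0.4 and later documents a <global> section where the job tracker, name node, and shared configuration properties can be declared once for all actions. The sketch below reuses the property names from the example further down and is only an illustration, not a tested configuration; depending on the Oozie version, the global section may not be honoured by every action type, so it needs testing.

    <workflow-app name="My_Workflow" xmlns="uri:oozie:workflow:0.5">
        <!-- declared once and inherited by every action in the workflow -->
        <global>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>user_name</name>
                    <value>${user_name}</value>
                </property>
            </configuration>
        </global>
        <!-- start/action/end nodes go here -->
    </workflow-app>

Note that properties from the global configuration are merged into each action's configuration; they are not automatically exported as environment variables inside a shell action.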

Example:

    job.properties
    oozie.use.system.libpath=true
    security_enabled=False
    dryrun=False
    nameNode=hdfs://localhost:8020
    user_name=test
    jobTracker=localhost:8032

    workflow.xml
    <workflow-app name="My_Workflow" xmlns="uri:oozie:workflow:0.5">
        <start to="shell-a0a5"/>
        <kill name="Kill">
            <message>Error [${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <action name="shell-a0a5">
            <shell xmlns="uri:oozie:shell-action:0.1">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <exec>script1.sh</exec>
                <file>/user/hive/script1.sh#script1.sh</file>
            </shell>
            <ok to="End"/>
            <error to="Kill"/>
        </action>
        <end name="End"/>
    </workflow-app>

My script1.sh expects a parameter named user_name, which I have declared in job.properties, but it is not being passed through in my workflow; the action fails with "missing argument username".

I would like to know how I can pass parameters to a shell script from a global configuration file.
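
For reference, the shell action schema uri:oozie:shell-action:0.1 defines <argument> and <env-var> elements, which is how a job.properties value normally reaches the script. A sketch of how user_name could in principle be handed to script1.sh is shown below; it is untested here and is not the workaround described in the solution:

    <shell xmlns="uri:oozie:shell-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>script1.sh</exec>
        <!-- available as $1 inside script1.sh -->
        <argument>${user_name}</argument>
        <!-- or exported as the environment variable user_name -->
        <env-var>user_name=${user_name}</env-var>
        <file>/user/hive/script1.sh#script1.sh</file>
    </shell>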

Thanks


Solution

  • I was not able to create global parameters to pass values such as user, password, and HADOOP_USER_NAME (in my case), but I was able to work around it inside the shell script. Within the script I define the following:

    # impersonate the admin user for the hdfs and sqoop calls
    export HADOOP_USER_NAME=admin
    # read the connection string (stored once in HDFS) into a shell variable
    connection=$(hdfs dfs -cat /user/connection.txt)
    

    where connection.txt contains all the information for the connection string; then, still inside the shell script, I pass that information to Sqoop like this:

    sqoop $connection --table test --target-dir /user/hive/warehouse/Jeff.db/test/ --m 1 --delete-target-dir
    

    and in this way I was able to resolve my problem. I still had to declare some global variables, but those were necessary to execute the Sqoop jobs in parallel using &, as sketched below.
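
A rough sketch of that parallel pattern follows; apart from the test table and the paths already mentioned, the extra table names are placeholders for illustration only:

    # impersonate the admin user, read the shared connection string once
    export HADOOP_USER_NAME=admin
    connection=$(hdfs dfs -cat /user/connection.txt)

    # "test2" and "test3" are hypothetical table names used only for this sketch
    for tbl in test test2 test3; do
        # $connection is left unquoted on purpose so it splits into separate sqoop arguments
        sqoop $connection --table "$tbl" \
              --target-dir "/user/hive/warehouse/Jeff.db/$tbl/" \
              --m 1 --delete-target-dir &
    done
    wait    # block until every background sqoop job has finished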