
How to pass a Pig option as a parameter in Oozie?


To execute my Pig script, I need to turn off an optimizer rule. The command below works fine from the command line and in scripts.

 pig -t ColumnMapKeyPrune population.pig

How can I pass this option in Oozie?

I tried passing it as arguments:

<action>
<pig>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>                      
    <script>Population.pig</script> 
    <argument>-t</argument>
    <argument>ColumnMapKeyPrune</argument>
    <param>piggybankJar=${piggybankJar}</param>
    <param>datafuJar=${datafuJar}</param>
    <param>inputPath=${inputPath}</param>
    <param>outputPath=${outputPath}</param>
</pig>
</action>

I received the error below:

E0701: XML schema error, cvc-complex-type.2.4.a: Invalid content was found starting with element 'param'. One of '{"uri:oozie:workflow:0.4":argument, "uri:oozie:workflow:0.4":file, "uri:oozie:workflow:0.4":archive}' is expected.

I also tried using param, but that did not work either:

 <action>
 <pig>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>                      
    <script>Population.pig</script> 
    <param>-t</param>
    <param>ColumnMapKeyPrune</param>
    <param>piggybankJar=${piggybankJar}</param>
    <param>datafuJar=${datafuJar}</param>
    <param>inputPath=${inputPath}</param>
    <param>outputPath=${outputPath}</param>
</pig>
</action>

The Oozie Pig action allows only the param, archive, file, and argument tags for this. How can I pass the optimizer_off option?


Solution

  • Set the property below in your Pig script and try again.

    set pig.optimizer.rules.disabled 'ColumnMapKeyPrune';

    For reference, see the optimization rules section of the Pig documentation:

    [http://pig.apache.org/docs/r0.14.0/perf.html#optimization-rules]
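A further option may be worth trying. The E0701 message in the question says that after an argument element only argument, file, or archive is accepted, which suggests the uri:oozie:workflow:0.4 schema expects every param element to come before the first argument element. A minimal sketch of the first attempt with only the element order changed is shown below. It is untested here and assumes the rest of the action is valid as posted.

```xml
<action>
<pig>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <script>Population.pig</script>
    <!-- all param elements first, as the schema error implies -->
    <param>piggybankJar=${piggybankJar}</param>
    <param>datafuJar=${datafuJar}</param>
    <param>inputPath=${inputPath}</param>
    <param>outputPath=${outputPath}</param>
    <!-- then the command-line arguments: -t ColumnMapKeyPrune -->
    <argument>-t</argument>
    <argument>ColumnMapKeyPrune</argument>
</pig>
</action>
```

If the workflow then validates, the -t flag should reach Pig as it does on the command line. The set statement in the script remains the simpler route if it does not.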