Tags: hadoop, classpath, apache-pig, oozie

Oozie Pig action: change the Guava dependency for the job jar


How can I configure an Oozie Pig action to give precedence to the user classpath? The Pig version is 0.10.0-cdh4.2.1.

I have tried all of the following properties:

mapreduce.task.classpath.user.precedence
mapreduce.task.classpath.first
mapreduce.job.user.classpath.first
mapreduce.user.classpath.first

as part of the configuration of the Pig action, for example:

<action name="my_action">
    <pig>
        <!-- ... -->
        <configuration>
            <property>
                <name>mapreduce.job.user.classpath.first</name>
                <value>true</value>
            </property>
        </configuration>
        <!-- ... -->
    </pig>
    <!-- ... -->
</action>

None of them seems to work. The problem is that Pig (somehow) depends on Guava 11 while my job depends on Guava 13, so I want my job's jars to come first in the classpath. EDIT: I think it is Oozie, not Pig, that depends on Guava 11.

I can't get this to work. Any pointers?

More information: after going through the logs, I see the following:

mapred.job.classpath.files // has guava13 first in the classpath
mapred.cachefiles // has guava13 first in the classpath

However, when org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher launches the job, the ZooKeeper client environment logged under JobControl has Guava 11 first in the classpath:

[JobControl] INFO  org.apache.zookeeper.ZooKeeper  - Client environment:java.class.path= // has guava 11 first in the classpath !

Solution

  • For Oozie Pig and Hive actions, set both of these properties in the action's configuration:

    <property>
        <name>oozie.launcher.mapreduce.task.classpath.user.precedence</name>
        <value>true</value>
    </property>
    <property>
        <name>mapreduce.task.classpath.user.precedence</name>
        <value>true</value>
    </property>
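
For context, here is a minimal sketch of where the two properties would sit in a complete Pig action. The `oozie.launcher.` prefix is how Oozie passes a property to its launcher job, which is presumably where the MapReduceLauncher from the question runs and where Guava 11 showed up first. The `${jobTracker}` and `${nameNode}` parameters, the script name `my_script.pig`, and the `end`/`fail` transition targets are placeholders I made up for illustration, not values from the question.

```xml
<!-- Sketch only: ${jobTracker}, ${nameNode}, my_script.pig and the "end"/"fail"
     node names are hypothetical placeholders. -->
<action name="my_action">
    <pig>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- oozie.launcher.* properties are passed to the Oozie launcher job -->
            <property>
                <name>oozie.launcher.mapreduce.task.classpath.user.precedence</name>
                <value>true</value>
            </property>
            <!-- the unprefixed property is for the jobs that Pig itself submits -->
            <property>
                <name>mapreduce.task.classpath.user.precedence</name>
                <value>true</value>
            </property>
        </configuration>
        <script>my_script.pig</script>
    </pig>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Whether the user-precedence property is honored depends on the Hadoop distribution and version, so it is worth checking the launcher's `java.class.path` log line again after the change to confirm that Guava 13 now comes first.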