apache-spark, hive, apache-spark-sql, oozie, oozie-workflow

Unable to schedule a job in Oozie: error while creating HiveContext


I am trying to run a Spark job from Oozie. This is the code I am trying to run:

SparkConf conf = getConf(appName);
JavaSparkContext sc = new JavaSparkContext(conf);
HiveContext hiveContext = new HiveContext(sc);

I am getting the following error:

JOB[0000000-170808082825775-oozie-oozi-W] ACTION[0000000-170808082825775-oozie-oozi-W@Sample-node] Launcher exception: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)

Here is my workflow XML file:

<workflow-app name="DataSampling" xmlns="uri:oozie:workflow:0.4">
    <start to='Sample-node'/>
    <action name="Sample-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>tez.lib.uris</name>
                    <value>/hdp/apps/2.5.3.0-37/tez/tez.tar.gz</value>
                </property>
            </configuration>
            <master>${master}</master>
            <mode>${mode}</mode>
            <name>Sample class on Oozie - Sampling</name>
            <class>Sampling</class>
            <jar>/path/jarfile.jar</jar>
            <arg>${numEventsPerPattern}</arg>
            <arg>${eventdate}</arg>
            <arg>${eventtype}</arg>
            <arg>${user}</arg>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed, error
            message[${wf:errorMessage(wf:lastErrorNode())}]
        </message>
    </kill>
    <end name='end'/>
</workflow-app>

I am using Hortonworks Data Platform (HDP) 2.5. Can anyone tell me whether I am missing something in the classpath?

Thanks in advance.


Solution

  • It finally worked: Oozie is now able to create the HiveContext.

    The issue was with the classpath. Delete the folder /user/oozie/share/lib in HDFS.

    In Ambari, under core-site.xml, set the following properties to *:

    hadoop.proxyuser.oozie.groups
    hadoop.proxyuser.oozie.hosts
    hadoop.proxyuser.root.groups
    hadoop.proxyuser.root.hosts
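
One way to confirm that these settings took effect is to read them back from the client configuration. The sketch below relies on `hdfs getconf -confKey`; the `check_proxyuser` helper and the `GETCONF` override are made up for illustration, and it assumes the `hdfs` CLI is on the PATH of a node whose core-site.xml is managed by Ambari.

```shell
#!/bin/sh
# Property names taken from the answer above.
PROXYUSER_KEYS="hadoop.proxyuser.oozie.groups
hadoop.proxyuser.oozie.hosts
hadoop.proxyuser.root.groups
hadoop.proxyuser.root.hosts"

# check_proxyuser (hypothetical helper): prints "key=value" for each property.
# GETCONF can be overridden for testing; by default it runs `hdfs getconf -confKey`.
check_proxyuser() {
    for key in $PROXYUSER_KEYS; do
        # Fall back to "<unset>" when the lookup fails or the CLI is missing.
        value=$(${GETCONF:-hdfs getconf -confKey} "$key" 2>/dev/null) || value="<unset>"
        echo "$key=$value"
    done
}

check_proxyuser
```

If the change was applied, each line should end in `=*`; a line ending in `=<unset>` suggests the property is missing or the client configuration has not been refreshed.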
    

    Create a new shared library using the following command:

    /usr/hdp/current/oozie-client/bin/oozie-setup.sh sharelib create -fs /user/oozie/share/lib
    

    Restart the Oozie service.

    The two preceding steps should be performed as the oozie user.
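
The sharelib steps above can be collected into a small script. This is only a sketch, meant to be run as the oozie user: the `run` dry-run wrapper and the function names are hypothetical, the `OOZIE_URL` default is an assumption, and the `-fs` value is copied verbatim from the answer (the Oozie documentation describes `-fs` as the filesystem URI, so a cluster may need something like an `hdfs://` address there instead). By default nothing is executed; each command is only printed.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed. Set RUN=1 to execute.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

SHARELIB_DIR=/user/oozie/share/lib                     # path from the answer
FS_ARG=${FS_ARG:-/user/oozie/share/lib}                # value the answer passed to -fs
OOZIE_URL=${OOZIE_URL:-http://localhost:11000/oozie}   # assumed Oozie server URL

# Remove the old sharelib and create a fresh one.
rebuild_sharelib() {
    run hdfs dfs -rm -r "$SHARELIB_DIR"
    run /usr/hdp/current/oozie-client/bin/oozie-setup.sh sharelib create -fs "$FS_ARG"
}

# After restarting Oozie, list the jars Oozie sees for the spark sharelib.
verify_sharelib() {
    run oozie admin -oozie "$OOZIE_URL" -shareliblist spark
}

rebuild_sharelib
```

Once the Oozie service has been restarted, calling `verify_sharelib` (with `RUN=1`) lists the Spark sharelib contents, which helps to check that the Hive-related jars are actually on the launcher classpath.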

    Add the following tag to the workflow XML file, inside the <spark> action (after <jar> and before the <arg> elements):

    <spark-opts>--num-executors 6 --driver-memory 8g --executor-memory 6g</spark-opts>
    

    Run the Oozie job as the hdfs user.
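
Submitting as the hdfs user could look roughly like the following. The `run` wrapper and `submit_as_hdfs` function are hypothetical, and the server URL and the `job.properties` file name are assumptions rather than values from the question. The command is printed by default and only executed with `RUN=1`.

```shell
#!/bin/sh
# Dry run by default: the command is printed, not executed. Set RUN=1 to execute.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

OOZIE_URL=${OOZIE_URL:-http://localhost:11000/oozie}   # assumed Oozie server URL
JOB_PROPERTIES=${JOB_PROPERTIES:-job.properties}       # hypothetical properties file

# Submit and start the workflow as the hdfs user.
submit_as_hdfs() {
    run sudo -u hdfs oozie job -oozie "$OOZIE_URL" -config "$JOB_PROPERTIES" -run
}

submit_as_hdfs
```

For the rebuilt sharelib to be picked up, the properties file would normally also set `oozie.use.system.libpath=true`.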