
Hive on spark doesn't work in hue


I am trying to trigger Hive on Spark from the Hue interface. The job works perfectly when run from the command line, but when I try to run it from Hue it throws exceptions. In Hue, I tried mainly two things:

1) Giving all the properties in the .hql file using set commands:

set spark.home=/usr/lib/spark;
set hive.execution.engine=spark; 
set spark.eventLog.enabled=true;
add jar /usr/lib/spark/assembly/lib/spark-assembly-1.5.0-cdh5.5.1-hadoop2.6.0-cdh5.5.1.jar;
set spark.eventLog.dir=hdfs://10.11.50.81:8020/tmp/;
set spark.executor.memory=2899102923;

This fails with the following error:

ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Unsupported execution engine: Spark.  Please set hive.execution.engine=mr)'
org.apache.hadoop.hive.ql.metadata.HiveException: Unsupported execution engine: Spark.  Please set hive.execution.engine=mr

2) Giving the same properties in the Hue job properties. The query then works with the mr engine, but not with the spark execution engine.

Any help would be appreciated.


Solution

  • I solved this issue by using a shell action in Oozie. The shell action runs spark-submit on a PySpark script that contains my SQL.

    Even though the job shows up as MR in the JobTracker, the Spark history server recognizes it as a Spark application, and the output is produced as expected.

    shell file:

    #!/bin/bash
    # Run from the Oozie working directory so the shipped .py file is found
    export PYTHONPATH=`pwd`
    spark-submit --master local testabc.py
    

    python file:

    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext()
    sqlContext = HiveContext(sc)
    # Run the Hive SQL through Spark instead of the Hive execution engine
    result = sqlContext.sql("insert into table testing_oozie.table2 select * from testing_oozie.table1")
    result.show()
    sc.stop()
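
    For completeness, the Oozie shell action wrapping the script above might look like the following sketch. The workflow name, script file name, and transition node names are assumptions for illustration, not taken from the original job:

    ```xml
    <workflow-app name="hive-on-spark-wf" xmlns="uri:oozie:workflow:0.5">
        <start to="spark-shell"/>
        <action name="spark-shell">
            <shell xmlns="uri:oozie:shell-action:0.2">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <exec>run_spark.sh</exec>
                <!-- Ship the wrapper script and the PySpark file with the action -->
                <file>run_spark.sh#run_spark.sh</file>
                <file>testabc.py#testabc.py</file>
            </shell>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>
    ```

    The `<file>` elements make Oozie copy both files into the action's working directory, which is why the shell script can refer to `testabc.py` by its bare name.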