Tags: java, apache-spark, hadoop, hive

Hive failing to create Spark session


I have tried a lot and read plenty of the Spark and Hive docs, and the information does not even match up. For example, the Hive on Spark getting-started guide (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started) says Hive 2.3.x is tested with Spark 2.0.0, while the Spark docs (https://spark.apache.org/docs/3.3.2/building-spark.html) say Spark 3.3.2 is built for Hive 2.3.9. The two sets of docs disagree, and when I actually try Hive 2.3.9 on Spark 3.3.2, I get errors when Hive tries to create a Spark session.
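For what it's worth, one way to check which Hive version a given Spark binary actually bundles is to list the Hive jars in its distribution (this assumes a standard pre-built Spark with SPARK_HOME set; the jar names below reflect my understanding of Spark 3.3.x packaging):

ls $SPARK_HOME/jars | grep hive

On a Spark 3.3.2 pre-built distribution this should list jars such as hive-common-2.3.9.jar and hive-metastore-2.3.9.jar, i.e. the Hive version that build is tied to.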

This error shows up when I run a query like:

select count(*) from table_name;

Error

Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create spark client.

So I checked the Hive logs; here is some information from them that you may need. FYI, my Spark runs on port 4040 on localhost.

WARN TransportChannelHandler: Exception in connection from localhost/127.0.0.1:4040
java.lang.IllegalArgumentException: Too large frame: 5211883372140375593
ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from localhost/127.0.0.1:4040 is closed
Exception in thread "main" org.apache.spark.SparkException: Exception thrown in awaitResult
ERROR [f0fd81a8-0d73-43d0-814e-bda0253c132a main] client.SparkClientImpl: Error while waiting for client to connect.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client 'cc1f8b8a-0bfd-4574-9b6d-90bb0238a71e'. Error: Child process exited before connecting back with error log Warning: Ignoring non-Spark config property: hive.spark.client.server.connect.timeout

For the full error log, see https://pastebin.com/CiuHRCsy


Solution

  • I found the reason for the error. In hive-site.xml, which lives in $HIVE_HOME/conf, I changed the value of the spark.submit.deployMode property from cluster to client, as shown in the snippet below. That solved the issue, since I am running Spark and Hive locally.
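For reference, this is roughly what the changed property looks like in $HIVE_HOME/conf/hive-site.xml (a minimal sketch; a real file will contain many other properties):

<property>
  <!-- Run the Spark driver in the submitting process instead of on a cluster. -->
  <name>spark.submit.deployMode</name>
  <value>client</value>
</property>

In client mode the Spark driver runs inside the process that submits the job rather than being launched on a cluster manager, which is why it suits a single-machine Hive-on-Spark setup.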