apache-spark, apache-zeppelin, vora

Zeppelin fails: UTF8String.class differs between the Vora and Spark 1.5.2 libraries


I installed Vora 1.1 Patch 1 on HDP 2.3 with Spark 1.5.2, on SLES 11 SP3. This is not exactly the configuration described in Note 2213226, but the shell version of Vora seems to work properly with test 2.7 from the installation manual. (The manual does not prescribe an HDP version for each OS version, so I went with HDP 2.3 on SLES.)

I have problems with Zeppelin, though. Installing version 0.5.6 from GitHub seems to have succeeded, and I can execute a "create table" statement in a Zeppelin notebook, but when I execute a "show tables" statement I get this error:

Error: Job aborted due to stage failure: Task 0 in stage 12.0 failed 4 times,
    most recent failure: Lost task 0.3 in stage 12.0 (TID 36, eba156.extendtec.com.au):
    java.io.InvalidClassException: org.apache.spark.unsafe.types.UTF8String; local
    class incompatible: stream classdesc serialVersionUID = 7459647620003804432, 
    local class serialVersionUID = 7786395165093970948 
  at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
  at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
  at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
  at ... (rest of the stack trace omitted)

I believe I have found the reason:

  • The UTF8String.class that comes from the library spark-sap-datasources-1.2.10-assembly.jar (and is then used by Zeppelin) is dated Jan 20 and is 17919 bytes in size.
  • The UTF8String.class contained in the Spark 1.5.2 library is dated Dec 16 and is 18653 bytes in size.

So I guess the two libraries ship different versions of this class, which would explain the serialVersionUID mismatch in the exception.
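A comparison like the one above can be scripted rather than done by hand. The following is a minimal Python sketch, not part of the original setup: it reads the UTF8String.class entry from each jar and reports its size, timestamp, and SHA-256 digest. The function names `describe_class` and `same_class` are made up for illustration, and the jar paths are whatever applies on your system. Note that differing bytes are only a strong hint; the serialVersionUID itself is computed by the JVM from the class structure.

```python
import hashlib
import sys
import zipfile
from datetime import datetime

# Path of the class inside a jar (taken from the class name in the exception).
CLASS_ENTRY = "org/apache/spark/unsafe/types/UTF8String.class"


def describe_class(jar_path, entry=CLASS_ENTRY):
    """Return size, timestamp and SHA-256 of one class file in a jar.

    Returns None if the jar does not contain the entry.
    (Hypothetical helper, written for this illustration.)
    """
    with zipfile.ZipFile(jar_path) as jar:
        try:
            info = jar.getinfo(entry)
        except KeyError:
            return None
        data = jar.read(entry)
    return {
        "jar": jar_path,
        "size": info.file_size,                  # uncompressed size in bytes
        "modified": datetime(*info.date_time),   # timestamp stored in the archive
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def same_class(jar_a, jar_b, entry=CLASS_ENTRY):
    """Return True if both jars carry a byte-identical copy of the entry."""
    a = describe_class(jar_a, entry)
    b = describe_class(jar_b, entry)
    if a is None or b is None:
        raise LookupError("entry %s is missing from one of the jars" % entry)
    return a["sha256"] == b["sha256"]


if __name__ == "__main__":
    # Print one summary per jar given on the command line.
    for path in sys.argv[1:]:
        print(describe_class(path))
```

Saved under a name of your choice (for example `compare_class.py`, a placeholder), it could be run with the Vora assembly jar and the Spark assembly jar as arguments. Two different digests would be consistent with the size difference noted above.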

How should I proceed?

Thanks!


Solution

  • Up to and including Vora 1.1 Patch 1, the Spark 1.5.2 build that comes with HDP 2.3.4 is not officially supported (the HDP build of Spark 1.5.2 differs slightly from the Apache Spark 1.5.2 release). There are two known issues, one with the Thrift server and one with Zeppelin. The easiest workaround is to install Apache Spark 1.5.2 outside of Ambari and use it instead of the HDP Spark build.

    As of Vora 1.2 (released March 31, 2016), both issues with the HDP build of Spark 1.5.2 are resolved and Vora is fully compatible with it.