Search code examples
cassandradatastax-enterprisedatastax-java-driver

Cannot execute this Scala application via dse spark-submit


I am trying to use the following command from a DSE client node in order to send a Scala application fat-jar to a single-node Datastax environment.

dse -u cassandra -p <password> spark-submit --class my.package.bde.TestSparkApp target/big-data-engine-0.0.1-jar-with-dependencies.jar --master dse://10.0.0.4?

Also, in the /etc/dse/spark-defaults.conf , among others, I have put the following line:

spark.cassandra.connection.host 10.0.0.4
spark.cassandra.auth.username cassandra
spark.cassandra.auth.password <pass>
com.datastax.bdp.fs.client.authentication.basic.username cassandra
com.datastax.bdp.fs.client.authentication.basic.password <pass>

However, the output of the command is the following.

Anybody knows how it could be fixed?

    WARN  2019-05-24 09:01:58,116 com.datastax.driver.core.NettyUtil: Found Netty's native epoll transport in the classpath, but epoll is not available. Using NIO instead.
java.lang.NoSuchMethodError: Method io.netty.channel.epoll.NativeStaticallyReferencedJniMethods.efdnonblock()I name or signature does not match
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
    at java.lang.Runtime.load0(Runtime.java:809)
    at java.lang.System.load(System.java:1086)

WARN  2019-05-24 09:02:02,079 hive.ql.metadata.Hive: Failed to access metastore. This class should not accessed in runtime.
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
    at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
    at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
    at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:191)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:362)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:266)
    at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
    at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:195)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
    at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:194)
    at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)
    at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1059)
    at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:137)
    at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:136)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:136)
    at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:133)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:171)
    at org.apache.spark.sql.Dataset$.apply(Dataset.scala:62)
    at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:466)
    at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:377)
    at org.apache.spark.sql.SQLImplicits.localSeqToDatasetHolder(SQLImplicits.scala:213)
    at tech.metis.bde.TestSparkApp$.delayedEndpoint$tech$metis$bde$TestSparkApp$1(TestSparkApp.scala:35)
    at tech.metis.bde.TestSparkApp$delayedInit$body.apply(TestSparkApp.scala:20)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
    at scala.App$class.main(App.scala:76)
    at tech.metis.bde.TestSparkApp$.main(TestSparkApp.scala:20)
    at tech.metis.bde.TestSparkApp.main(TestSparkApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.DseSparkSubmit$.org$apache$spark$deploy$DseSparkSubmit$$runMain(DseSparkSubmit.scala:662)
    at org.apache.spark.deploy.DseSparkSubmit$.doRunMain$1(DseSparkSubmit.scala:167)
    at org.apache.spark.deploy.DseSparkSubmit$.submit(DseSparkSubmit.scala:192)
    at org.apache.spark.deploy.DseSparkSubmit$.main(DseSparkSubmit.scala:106)
    at org.apache.spark.deploy.DseSparkSubmitBootstrapper$.main(DseSparkSubmitBootstrapper.scala:106)
    at org.apache.spark.deploy.DseSparkSubmitBootstrapper.main(DseSparkSubmitBootstrapper.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
    at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
    ... 57 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
    ... 63 common frames omitted
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Got exception: com.datastax.bdp.fs.client.DseFsConnectionException Could not connect to DSEFS. Last endpoint tried: 10.0.0.4:5598
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1213)
    at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:106)
    at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:140)
    at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:146)
    at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
    at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:601)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
    ... 68 common frames omitted
WARN  2019-05-24 09:02:02,216 io.netty.channel.ChannelInitializer: Failed to initialize a channel. Closing: [id: 0x61f0844e]
java.lang.VerifyError: Bad invokespecial instruction: current class isn't assignable to reference class.
Exception Details:
  Location:
    com/datastax/bdp/fs/rest/client/StartTlsClientHandler.decode(Lio/netty/channel/ChannelHandlerContext;Lio/netty/handler/codec/http/HttpObject;Ljava/util/List;)V @4: invokespecial
  Reason:
    Error exists in the bytecode
  Bytecode:
    0x0000000: 2a2b 2c2d b701 72b2 003c 2db9 0178 0100
    0x0000010: 04a3 0007 04a7 0004 03b6 017b 2db9 017e
    0x0000020: 0100 9a00 1c2d 03b9 0181 0200 c001 373a
    0x0000030: 042d b901 8401 002a 2b19 04b7 0186 b1  
  Stackmap Table:
    same_locals_1_stack_item_frame(@24,Object[#56])
    full_frame(@25,{Object[#2],Object[#80],Object[#396],Object[#372]},{Object[#56],Integer})
    same_frame(@62)

    at com.datastax.bdp.fs.rest.client.RestClientConnection$ClientChannelInitializer.initChannel(RestClientConnection.scala:568)
    at com.datastax.bdp.fs.rest.client.RestClientConnection$ClientChannelInitializer.initChannel(RestClientConnection.scala:563)

ERROR 2019-05-24 09:02:04,434 hive.log: Got exception: com.datastax.bdp.fs.client.DseFsConnectionException Could not connect to DSEFS. Last endpoint tried: 10.0.0.4:5598
com.datastax.bdp.fs.client.DseFsConnectionException: Could not connect to DSEFS. Last endpoint tried: 10.0.0.4:5598
    at com.datastax.bdp.fs.client.RestClientProxy.<init>(RestClientProxy.scala:159)
    at com.datastax.bdp.fs.client.DseFsClient.liftedTree1$1(DseFsClient.scala:62)
    at com.datastax.bdp.fs.client.DseFsClient.<init>(DseFsClient.scala:62)
    at com.datastax.bdp.fs.hadoop.DseFileSystem.initialize(DseFileSystem.scala:118)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:362)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:104)
    at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:140)
    at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:146)
    at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
    at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:601)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
    at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:191)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:362)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:266)
    at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
    at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:195)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
    at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:194)
    at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)
    at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
    at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
    at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1059)
    at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:137)
    at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:136)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:136)
    at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:133)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:171)
    at org.apache.spark.sql.Dataset$.apply(Dataset.scala:62)
    at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:466)
    at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:377)
    at org.apache.spark.sql.SQLImplicits.localSeqToDatasetHolder(SQLImplicits.scala:213)
    at tech.metis.bde.TestSparkApp$.delayedEndpoint$tech$metis$bde$TestSparkApp$1(TestSparkApp.scala:35)
    at tech.metis.bde.TestSparkApp$delayedInit$body.apply(TestSparkApp.scala:20)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
    at scala.App$class.main(App.scala:76)
    at tech.metis.bde.TestSparkApp$.main(TestSparkApp.scala:20)
    at tech.metis.bde.TestSparkApp.main(TestSparkApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.DseSparkSubmit$.org$apache$spark$deploy$DseSparkSubmit$$runMain(DseSparkSubmit.scala:662)
    at org.apache.spark.deploy.DseSparkSubmit$.doRunMain$1(DseSparkSubmit.scala:167)
    at org.apache.spark.deploy.DseSparkSubmit$.submit(DseSparkSubmit.scala:192)
    at org.apache.spark.deploy.DseSparkSubmit$.main(DseSparkSubmit.scala:106)
    at org.apache.spark.deploy.DseSparkSubmitBootstrapper$.main(DseSparkSubmitBootstrapper.scala:106)
    at org.apache.spark.deploy.DseSparkSubmitBootstrapper.main(DseSparkSubmitBootstrapper.scala)
Caused by: com.datastax.bdp.fs.rest.client.RestConnectException: Failed to open connection to 10.0.0.4:5598
    at com.datastax.bdp.fs.rest.client.RestClientConnection$$anonfun$2.applyO
    io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:405)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.ExtendedClosedChannelException: null
    at io.netty.channel.AbstractChannel$AbstractUnsafe.ensureOpen(...)(Unknown Source)
ERROR 2019-05-24 09:02:04,438 hive.log: Converting exception to MetaException
Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1062)
    at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:137)
    at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:136)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:136)
    at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:133)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:171)
    at org.apache.spark.sql.Dataset$.apply(Dataset.scala:62)
    at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:466)
    at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:377)
    at org.apache.spark.sql.SQLImplicits.localSeqToDatasetHolder(SQLImplicits.scala:213)
    at tech.metis.bde.TestSparkApp$.delayedEndpoint$tech$metis$bde$TestSparkApp$1(TestSparkApp.scala:35)
    at tech.metis.bde.TestSparkApp$delayedInit$body.apply(TestSparkApp.scala:20)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
    at scala.App$class.main(App.scala:76)
    at tech.metis.bde.TestSparkApp$.main(TestSparkApp.scala:20)
    at tech.metis.bde.TestSparkApp.main(TestSparkApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.DseSparkSubmit$.org$apache$spark$deploy$DseSparkSubmit$$runMain(DseSparkSubmit.scala:662)
    at org.apache.spark.deploy.DseSparkSubmit$.doRunMain$1(DseSparkSubmit.scala:167)
    at org.apache.spark.deploy.DseSparkSubmit$.submit(DseSparkSubmit.scala:192)
    at org.apache.spark.deploy.DseSparkSubmit$.main(DseSparkSubmit.scala:106)
    at org.apache.spark.deploy.DseSparkSubmitBootstrapper$.main(DseSparkSubmitBootstrapper.scala:106)
    at org.apache.spark.deploy.DseSparkSubmitBootstrapper.main(DseSparkSubmitBootstrapper.scala)
Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;

Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
    ... 54 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
    ... 60 more
Caused by: MetaException(message:Got exception: com.datastax.bdp.fs.client.DseFsConnectionException Could not connect to DSEFS. Last endpoint tried: 10.0.0.4:5598)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1213)
    at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:106)
    at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:140)
    at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:146)
    at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
    at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:601)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
    ... 65 more

A part of the output has been ommitted.


Solution

  • The most probable problem is that you have compiled your fat jar with Spark or Scala versions that are different that is used by DSE, so it has incompatible components.

    Usually, it's enough to specify only following dependency in your build file (it's located in DataStax repository):

        <dependency>
          <groupId>com.datastax.dse</groupId>
          <artifactId>dse-spark-dependencies</artifactId>
          <version>6.0.7</version>
          <scope>provided</scope>
        </dependency>
    

    and that's all (of course, you need to add your application dependencies) - just compile, get jar file, and run. You can find more examples of building for DSE in DataStax repo.

    Also, please note that everything that you're specifying after the name of the jar file is treated as application parameters, so you may need to move --master before name of jar file