apache-spark, cassandra, cassandra-3.0, spark-cassandra-connector

Exception in thread "main" java.lang.IllegalStateException: Cannot retrieve files with 'spark' scheme without an active SparkEnv


I'm very new to Spark and Cassandra. I got a sample from GitHub and tried to run the application from the link below:

spark-on-cassandra-quickstart

After the jar file was generated, I tried executing it with the following command:

C:\Users\user\Desktop\softwares\spark-2.4.3-bin-hadoop2.7\spark-2.4.3-bin-hadoop2.7\bin>spark-submit --class com.github.boneill42.JavaDemo --master spark://localhost:7077
C:\Users\user\git\spark-on-cassandra-quickstart\target/spark-on-cassandra-0.0.1-SNAPSHOT-jar-with-dependencies.jar spark://localhost:7077 localhost

Below is the error I'm facing:

19/06/08 22:59:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.IllegalStateException: Cannot retrieve files with 'spark' scheme without an active SparkEnv.
        at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:690)
        at org.apache.spark.deploy.DependencyUtils$.downloadFile(DependencyUtils.scala:137)
        at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:367)
        at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:367)
        at scala.Option.map(Option.scala:146)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:366)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:143)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Please help me resolve this issue.


Solution

  • In your case, it seems you want to start the application in standalone mode:

    spark://HOST:PORT   Connect to the given Spark standalone cluster master.
    The port must be whichever one your master is configured to use, which is 7077 by default.
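    In other words, before spark-submit can reach spark://localhost:7077, a standalone master must actually be listening on that host and port. A quick way to check (a sketch; netstat ships with both Windows and Linux):

    REM on Windows
    netstat -an | findstr 7077

    # on Linux/macOS
    netstat -an | grep 7077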

    Did you start the Spark master and worker first?

    Launch the master:

    ./sbin/start-master.sh
    
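    If you need the master bound to a specific host or port, the documented SPARK_MASTER_HOST and SPARK_MASTER_PORT variables control that (a sketch, assuming the defaults you are already using):

    # optional: pin the master address before launching
    export SPARK_MASTER_HOST=localhost
    export SPARK_MASTER_PORT=7077
    ./sbin/start-master.sh

    Once it is up, the master's web UI (http://localhost:8080 by default) shows the exact spark://HOST:PORT URL to pass to --master.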

    Launch a worker:

    ./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077 -c 1 -m 512M
    
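    Note that the .sh scripts above need a Unix-like shell. Since your spark-submit path suggests you are on Windows, you can launch the same daemons directly through spark-class from your Spark directory instead (a sketch of the equivalent commands, assuming the same host, port, and resources):

    REM launch the master (listens on spark://localhost:7077)
    bin\spark-class org.apache.spark.deploy.master.Master --host localhost --port 7077

    REM launch a worker with 1 core and 512 MB, attached to that master
    bin\spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077 -c 1 -m 512M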

    After the master and worker are started, you can submit your job again.
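    For reference, this is your original command resubmitted once the master and worker are running, unchanged except that the mixed forward slash in the jar path is normalized to backslashes (^ continues the line in cmd.exe):

    spark-submit --class com.github.boneill42.JavaDemo --master spark://localhost:7077 ^
      C:\Users\user\git\spark-on-cassandra-quickstart\target\spark-on-cassandra-0.0.1-SNAPSHOT-jar-with-dependencies.jar ^
      spark://localhost:7077 localhost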