I have 1 master with 3 worker nodes communicating with it.
For disaster recovery we have set up 2 masters and let ZooKeeper elect the active one. I am using DataStax's Spark Cassandra Connector. Is there a way to pass multiple Spark master URLs so they are tried one after another until one succeeds?
new SparkConf(true)
    .set("spark.cassandra.connection.host", "10.3.2.1")
    .set("spark.cassandra.auth.username", "cassandra")
    .set("spark.cassandra.auth.password", "cassandra")
    .set("spark.master", "spark://1.1.2.2:7077") // Can I give multiple URLs here?
    .set("spark.app.name", "Sample App");
tl;dr Use a comma to separate host:port entries, e.g. spark://localhost:7077,localhost:17077
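Applied to the configuration from the question, that would look roughly like the sketch below. The standby master's address (1.1.2.3) is an assumption; substitute the real host of your second master. The driver will register with the listed masters and use whichever ZooKeeper has elected as the active one.

import org.apache.spark.SparkConf

// Sketch: both masters listed in a single spark.master value, separated by a comma.
val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "10.3.2.1")
  .set("spark.cassandra.auth.username", "cassandra")
  .set("spark.cassandra.auth.password", "cassandra")
  .set("spark.master", "spark://1.1.2.2:7077,1.1.2.3:7077") // active + standby (standby address assumed)
  .set("spark.app.name", "Sample App")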
Please note that you should avoid hardcoding connection details, as they are an operational concern and should really be defined using spark-submit's --master
command-line option:
$ ./bin/spark-submit --help
Options:
--master MASTER_URL spark://host:port, mesos://host:port, yarn, or local.
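For example, a submission against the HA pair might look like the following (the host names, main class and jar are placeholders, not taken from the question):

$ ./bin/spark-submit \
    --master spark://master1:7077,master2:7077 \
    --class com.example.SampleApp \
    sample-app.jar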
See the relevant Spark code where the parsing happens:
val masterUrls = sparkUrl.split(",").map("spark://" + _)
where sparkUrl is the portion captured by the """spark://(.*)""".r regex.
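As a quick illustration of what that line produces, here is a standalone sketch that mimics Spark's parsing (it is not the actual SparkContext code):

// Sketch: mimic how Spark splits a comma-separated standalone master URL.
object MasterUrlParsingDemo {
  val SPARK_REGEX = """spark://(.*)""".r

  def parse(master: String): Array[String] = master match {
    case SPARK_REGEX(sparkUrl) => sparkUrl.split(",").map("spark://" + _)
    case _ => Array.empty
  }

  def main(args: Array[String]): Unit = {
    // Prints: spark://localhost:7077, spark://localhost:17077
    println(parse("spark://localhost:7077,localhost:17077").mkString(", "))
  }
}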